WO2020192020A1 - Filtering method, device, encoder and computer storage medium - Google Patents
- Publication number
- WO2020192020A1 (PCT/CN2019/104499)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- information
- components
- processing
- filtered
- component
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- H04N19/82: Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation, involving filtering within a prediction loop
- G06T9/002: Image coding using neural networks
- H04N19/117: Adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding; filters, e.g. for pre-processing or post-processing
- H04N19/124: Adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding; quantisation
- H04N19/136: Adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding; incoming video signal characteristics or properties
- H04N19/176: Adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object, the region being a block, e.g. a macroblock
- H04N19/186: Adaptive coding characterised by the coding unit, the unit being a colour or a chrominance component
- H04N19/423: Implementation details or hardware specially adapted for video compression or decompression, characterised by memory arrangements
- H04N19/80: Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
- H04N19/86: Pre-processing or post-processing specially adapted for video compression, involving reduction of coding artifacts, e.g. of blockiness
Definitions
- The embodiments of the present application relate to the technical field of video image processing, and in particular to a filtering method, a device, an encoder, and a computer storage medium.
- The pre-processing filter pre-processes the original image to reduce the video resolution. Because the resolution of the video to be encoded is then lower than that of the original video, it can be represented with fewer bits, which can improve overall coding efficiency.
- The post-processing filter processes the in-loop filtered video and outputs video with an increased resolution.
- Neural network-based filters are often composed of multiple basic units.
- The input of the neural network is either a single input or multiple inputs, that is, a single image component or multiple image components. As a result, the complexity of existing convolutional neural networks (CNN, Convolutional Neural Network) is relatively high, and current CNN filters do not make full use of relevant information, which limits the improvement in reconstructed image quality.
- The embodiments of the present application provide a filtering method, a device, an encoder, and a computer storage medium, which can reduce the complexity of the neural network-based filtering method and help improve the quality of the reconstructed image.
- An embodiment of the present application provides a filtering method, the method including:
- inputting at least two components of the pixel information to be filtered and the at least one type of side information into a neural network-based filter, to output at least one filtered component of the pixel information to be filtered.
- An embodiment of the present application provides a filtering device, the filtering device including:
- a first obtaining module configured to obtain pixel information to be filtered;
- a second obtaining module configured to obtain at least one type of side information;
- a determining module configured to input at least two components of the pixel information to be filtered and the at least one type of side information into a neural network-based filter, to output at least one filtered component of the pixel information to be filtered.
- In a third aspect, an embodiment of the present application provides an encoder, the encoder including:
- An embodiment of the present application provides a computer storage medium in which executable instructions are stored; when the executable instructions are executed by one or more processors, the processors perform the filtering method described in one or more of the foregoing embodiments.
- The embodiments of the present application provide a filtering method, a device, an encoder, and a computer storage medium.
- The filtering device obtains pixel information to be filtered, obtains at least one type of side information, and inputs at least two components of the pixel information to be filtered together with the at least one type of side information into a neural network-based filter, to output at least one filtered component of the pixel information to be filtered. That is, in the embodiments of the present application, at least two components of the pixel information to be filtered and at least one type of side information are input into a neural network-based filter for processing.
- Side information of at least one component is incorporated into the filtering process to obtain the filtered pixel information.
- FIG. 1 is a schematic structural diagram of a traditional coding block diagram.
- FIG. 2 is a schematic structural diagram of a traditional decoding block diagram.
- FIG. 3 is a schematic flowchart of an optional filtering method provided by an embodiment of this application.
- FIG. 4 is a schematic structural diagram of a block division matrix provided by an embodiment of the application.
- FIG. 5 is a schematic structural diagram of a traditional CNN filter provided by an embodiment of the application.
- FIG. 6A is a schematic structural diagram of another traditional CNN filter provided by an embodiment of the application.
- FIG. 6B is a schematic structural diagram of yet another traditional CNN filter provided by an embodiment of the application.
- FIG. 7 is a schematic structural diagram of an optional filtering framework provided by an embodiment of the application.
- FIG. 8 is a schematic structural diagram of another optional filtering framework provided by an embodiment of the application.
- FIG. 9 is a schematic structural diagram of yet another optional filtering framework provided by an embodiment of this application.
- FIG. 10 is a schematic structural diagram of an optional filtering device provided by an embodiment of this application.
- FIG. 11 is a schematic structural diagram of an optional encoder provided by an embodiment of the application.
- The video to be coded includes original image frames, and each original image frame includes an original image, which undergoes various kinds of processing, such as prediction, transformation, quantization, reconstruction, and filtering.
- After this processing, the pixel values of the processed video image may have shifted relative to the original image, causing visual impairment or artifacts.
- When adjacent coding blocks use different coding parameters (such as different transformation processes, different quantization parameters (QP), different prediction methods, and different reference image frames), the magnitude and distribution of the error introduced by each coding block are independent of each other, and the resulting discontinuity at the boundaries of adjacent coding blocks produces blocking artifacts.
- FIG. 1 is a schematic structural diagram of a traditional coding block diagram.
- The traditional coding block diagram 10 may include components such as a transform and quantization unit 101, an inverse transform and inverse quantization unit 102, a prediction unit 103, an in-loop filtering unit 104, and an entropy coding unit 105; the prediction unit 103 further includes an intra prediction unit 1031 and an inter prediction unit 1032.
- A coding tree unit (CTU, Coding Tree Unit) can be obtained through preliminary division, and content-adaptive division of a CTU can then be continued to obtain coding units (CU).
- A CU generally contains one or more coding blocks (CB, Coding Block).
- The prediction unit 103 is also used to provide the selected intra prediction data or inter prediction data to the entropy coding unit 105. In addition, the inverse transform and inverse quantization unit 102 is used for reconstruction of the coding block: it reconstructs the residual block in the pixel domain, the in-loop filtering unit 104 removes blocking artifacts from the reconstructed residual block, and the reconstructed residual block is then added to the decoded image buffer unit to generate a reconstructed reference image. The entropy coding unit 105 is used for encoding various coding parameters and the quantized transform coefficients.
- The in-loop filtering unit 104 is a loop filter, also called an in-loop filter (In-Loop Filter), which may include a de-blocking filter (DBF, De-Blocking Filter), a sample adaptive offset (SAO, Sample Adaptive Offset) filter, an adaptive loop filter (ALF, Adaptive Loop Filter), and so on.
- The pre-processing filtering unit 106 receives the input original video frames and performs pre-processing filtering on the original image frames to reduce the resolution of the video.
- The post-processing filtering unit 107 receives the in-loop filtered video frames and performs post-processing filtering on them to increase the resolution of the video, so that the reconstructed video frames can be obtained using fewer bits during video encoding and decoding, which can improve overall coding and decoding efficiency.
- The input of the neural network currently used by both the pre-processing filter and the post-processing filter is a single input or multiple inputs, that is, a single image component or multiple image components. Consequently, the complexity of existing convolutional neural networks is high, and current CNN filters do not fully utilize relevant information, which limits the improvement in reconstructed image quality.
- FIG. 2 is a schematic structural diagram of a traditional decoding block diagram.
- The traditional decoding block diagram 20 may include components such as an entropy coding unit 201, an inverse quantization and inverse transformation unit 202, a prediction unit 203, an in-loop filtering unit 204, and a post-processing filtering unit 205; the prediction unit 203 further includes an intra prediction unit 2031 and an inter prediction unit 2032.
- The video decoding process is the reverse of the video encoding process; the post-processing filtered image obtained during video decoding is determined as the reconstructed video frame. As can be seen from FIG. 2, the decoding process does not involve the pre-processing filtering unit used in the encoding process.
- An embodiment of the application provides a filtering method, which is applied to a filtering device.
- The filtering device can be arranged in the pre-processing filter and the post-processing filter of the encoder, or in the post-processing filter of the decoder.
- The embodiments of the present application do not specifically limit this.
- FIG. 3 is a schematic flowchart of an optional filtering method provided by an embodiment of this application.
- The filtering method may include:
- S303: Input at least two components of the pixel information to be filtered and at least one type of side information into the neural network-based filter, to output at least one filtered component of the pixel information to be filtered.
- The above-mentioned pixel information to be filtered refers to the image block to be filtered, represented by pixel values.
- The image block to be filtered includes three image components.
- The above-mentioned at least two components may be any two of the three image components, or all three.
- The image components can include a first image component, a second image component, and a third image component. The embodiments of the present application are described taking as an example the case where the first image component represents the luminance component, the second image component represents the first chrominance component, and the third image component represents the second chrominance component.
- The aforementioned at least one type of side information may be at least one of the side information corresponding to the first image component, the side information corresponding to the second image component, and the side information corresponding to the third image component.
- The original image frame can be divided into CTUs, and a CTU can be further divided into CUs; that is, the block division information in the embodiments of this application may refer to CTU division information or CU division information.
- Accordingly, the filtering method of the embodiments of the present application can be applied not only to pre-processing filtering or post-processing filtering at the CU level, but also to pre-processing filtering or post-processing filtering at the CTU level, which is not specifically limited in the embodiments of the present application.
- The pixel information to be filtered is an original image block of the video to be encoded, expressed in pixel values; or the pixel information to be filtered is an image block, expressed in pixel values, obtained after in-loop filtering of the video to be encoded during the video encoding process.
- The reference image of the image to be encoded may also have undergone inverse transformation and inverse quantization, reconstruction, and filtering during video encoding. That is to say, the pixel information to be filtered can be an image block of the original image just input into the encoder, which is the case when the method is applied to the pre-processing filter; or it can be an image block just obtained after in-loop filtering, which is the case when the method is applied to the post-processing filter. In this way, the pixel information to be filtered is obtained.
- At least two components of the pixel information to be filtered and at least one type of side information can then be obtained.
- The first image component, the second image component, and the third image component are generally used to characterize the original image or the image to be filtered.
- The three image components are a luminance component, a blue chrominance (color difference) component, and a red chrominance (color difference) component. Specifically, the luminance component is usually denoted by the symbol Y, the blue chrominance component by Cb or U, and the red chrominance component by Cr or V.
- The at least one component represents one or more of the first image component, the second image component, and the third image component.
- The at least two components may be the first image component, the second image component, and the third image component; they may also be the first image component and the second image component, or the first image component and the third image component, or even the second image component and the third image component.
- VVC: next-generation video coding standard
- VTM: VVC Test Model
- The current standard test sequences adopt the YUV 4:2:0 format.
- Each frame of the video to be encoded in this format is composed of three components: a luminance component Y and two chrominance components U and V.
- The size information corresponding to the first image component is H × W.
- The size information corresponding to the second image component and to the third image component is (H/2) × (W/2) in both cases.
- The embodiments of the present application take the YUV 4:2:0 format as an example for description, but the filtering method of the embodiments of the present application is also applicable to other sampling formats.
- In the YUV 4:2:0 format, the size information of the first image component differs from that of the second or third image component. To input the first image component and/or the second image component and/or the third image component into the neural network-based filter at one time, the three components need to be resampled or recombined so that their spatial size information is the same.
- Pixel rearrangement processing may be performed on high-resolution image components so that the spatial size information of the three components is the same. Specifically, before S302, for at least two components of the pixel information to be filtered, the high-resolution image components may be selected and pixel rearrangement processing may be performed on them.
- The three components included in the original image are the original image components before any other processing. If the first image component is the luminance component, the second image component is the first chrominance component, and the third image component is the second chrominance component, then the high-resolution image component is the first image component, and the first image component needs to undergo pixel rearrangement processing.
- Taking an original image of size 2 × 2 as an example, it is converted into 4 channels, that is, the 2 × 2 × 1 tensor is rearranged into a 1 × 1 × 4 tensor. Thus, when the size information of the first image component of the original image is H × W, it can be converted into the form (H/2) × (W/2) × 4 by pixel rearrangement before filtering. Because the size information of the second image component and the third image component is (H/2) × (W/2) in both cases, the spatial size information of the three image components becomes the same. The three image components after pixel rearrangement are subsequently combined into the form (H/2) × (W/2) × 6 and input into the pre-processing filter or post-processing filter.
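- The pixel rearrangement described above (a 2 × 2 × 1 tensor becoming a 1 × 1 × 4 tensor) can be sketched as follows. This is only an illustrative sketch in plain Python: the function names `pixel_rearrange` and `stack_yuv420` are hypothetical, and the channel order inside each 2 × 2 block (top-left, top-right, bottom-left, bottom-right) is an assumption, since the text does not specify it.

```python
def pixel_rearrange(plane):
    """Rearrange an H x W plane (list of rows) into an (H/2) x (W/2) x 4 tensor."""
    h, w = len(plane), len(plane[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    return [
        [
            # Assumed channel order: top-left, top-right, bottom-left, bottom-right.
            [plane[y][x], plane[y][x + 1], plane[y + 1][x], plane[y + 1][x + 1]]
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]


def stack_yuv420(y_plane, u_plane, v_plane):
    """Combine the rearranged luma with both chroma planes: (H/2) x (W/2) x 6."""
    y4 = pixel_rearrange(y_plane)
    return [
        [y4[r][c] + [u_plane[r][c], v_plane[r][c]] for c in range(len(y4[0]))]
        for r in range(len(y4))
    ]


# A 4 x 4 luma plane with 2 x 2 chroma planes (YUV 4:2:0).
y = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
u = [[100, 101], [102, 103]]
v = [[200, 201], [202, 203]]
packed = stack_yuv420(y, u, v)
print(packed[0][0])  # [1, 2, 5, 6, 100, 200]
```

- After rearrangement, every spatial position carries four luma samples and one sample from each chroma plane, so the three components share the same (H/2) × (W/2) grid.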
- Alternatively, a low-resolution component may be selected and up-sampled.
- That is, the low-resolution components can also be up-sampled (adjusted upward).
- For the low-resolution components, not only up-sampling but also deconvolution processing or even super-resolution processing can be performed. These three kinds of processing have the same effect, and the embodiments of the present application do not specifically limit this.
- The three components contained in the original image are the original image components before any other processing. If the first image component is the luminance component, the second image component is the first chrominance component, and the third image component is the second chrominance component, then the low-resolution image component is the second image component or the third image component, and up-sampling processing needs to be performed on the second image component or the third image component.
- When the size information of the second image component and the third image component of the original image is (H/2) × (W/2) in both cases, each can be converted into the form H × W by up-sampling before filtering. Since the size information of the first image component is H × W, this also makes the spatial size information of the three image components the same.
- The up-sampled second image component and the up-sampled third image component then have the same resolution as the first image component.
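- As a rough illustration of the up-sampling option, the sketch below doubles a chroma plane in both directions so that it matches the luma resolution. Nearest-neighbour repetition is used here purely as an assumption for simplicity; the text equally allows deconvolution or super-resolution, and the function name `upsample_nearest_2x` is hypothetical.

```python
def upsample_nearest_2x(plane):
    """Up-sample an (H/2) x (W/2) plane to H x W by repeating each sample 2 x 2 times."""
    out = []
    for row in plane:
        wide = [value for value in row for _ in range(2)]  # repeat horizontally
        out.append(wide)
        out.append(list(wide))  # repeat vertically
    return out


u = [[100, 101],
     [102, 103]]
u_up = upsample_nearest_2x(u)
for row in u_up:
    print(row)
```

- Compared with pixel rearrangement, this route keeps the luma plane untouched and instead increases the amount of chroma data fed to the filter.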
- Side information can be used to assist filtering and improve filtering quality.
- The side information can be block division information (such as CU division information and/or CTU division information), quantization parameter information, or even motion vector (MV, Motion Vector) information, prediction direction information, and so on. These kinds of information can be used as side information alone or in any combination; for example, block division information alone, block division information together with quantization parameter information, or block division information together with MV information. This is not specifically limited in the embodiments of the present application.
- The at least two components and the at least one type of side information are input into the neural network-based filter for processing, which may include component processing, fusion processing, joint processing, branch processing, and other processing methods; this is not specifically limited in the embodiments of the present application.
- At least two components of the pixel information to be filtered and at least one type of side information are used as input values and input into a neural network-based filter, which can output at least one filtered component of the pixel information to be filtered.
- The structure of the neural network includes at least a joint processing stage and an independent processing stage; in the joint processing stage, all components are processed together; in the independent processing stage, each component is processed on an independent branch of the neural network.
- S303 may include:
- S3031: Process each of the at least two components separately to obtain at least two processed components;
- S3032: Perform fusion processing according to the at least one type of side information and the at least two processed components to obtain fusion information of the pixel information to be filtered;
- S3033: Process the fusion information to obtain at least one filtered component of the pixel information to be filtered.
- For S3031, the processing can be regarded as a splitting stage, used to obtain the at least two components separately. The fusion information includes at least the information obtained by fusing the at least two components; for S3032, the processing can be regarded as a merging stage, used to fuse the at least two components. In this way, the embodiments of the present application adopt a cascaded processing structure.
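- The cascade of S3031, S3032, and S3033 can be pictured with the toy sketch below. It is not the network of the application: the function names are hypothetical, the per-branch processing is an identity stand-in for learned layers, fusion is modelled as per-pixel channel stacking (an assumption), and the final stage is a per-pixel weighted sum with made-up weights. The components are assumed to have already been brought to the same spatial size.

```python
def split_stage(components, process):
    """S3031: process every component on its own branch."""
    return {name: process(plane) for name, plane in components.items()}


def merge_stage(processed, side_planes):
    """S3032: fuse processed components and side information.

    Fusion is modelled as per-pixel channel stacking, which is an assumption.
    """
    planes = list(processed.values()) + list(side_planes)
    height, width = len(planes[0]), len(planes[0][0])
    return [[[p[r][c] for p in planes] for c in range(width)]
            for r in range(height)]


def output_stage(fused, weights):
    """S3033: map the fusion information to one filtered component.

    A per-pixel weighted sum stands in for the learned layers.
    """
    return [[sum(w * v for w, v in zip(weights, pixel)) for pixel in row]
            for row in fused]


components = {"Y": [[10, 20], [30, 40]], "U": [[1, 2], [3, 4]]}
cu_map = [[2, 2], [2, 1]]  # hypothetical side information plane

processed = split_stage(components, process=lambda plane: [row[:] for row in plane])
fused = merge_stage(processed, [cu_map])
filtered_y = output_stage(fused, weights=[1, 0, 0])
print(fused[0][0])
print(filtered_y)
```

- The point of the sketch is only the data flow: separate branches first, a merge that brings components and side information together, then a stage that produces the filtered output.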
- S3031 may include:
- Component processing is performed on each of the at least two components to obtain at least two processed components.
- For example, the original image components Y, U, and V of the original image block are obtained, and Y, U, and V are processed separately to obtain the Y, U, and V components of the image block to be filtered; that is, the at least two components of the image block to be filtered can be Y and U, or Y and V, or U and V.
- S3031 may include:
- The first side information includes at least block division information and/or quantization parameter information.
- The first side information can be used to assist filtering and improve filtering quality.
- The first side information can be block division information (such as CU division information and/or CTU division information), quantization parameter information, or even motion vector (MV, Motion Vector) information, prediction direction information, and so on. These kinds of information can be used as the first side information alone or in any combination; for example, block division information alone, block division information together with quantization parameter information, or block division information together with MV information. This is not specifically limited in the embodiments of the present application.
- S3031 can be regarded as the first splitting stage.
- Component processing (such as deep learning) may be performed on each of the at least two components, and the first side information corresponding to each component may be added to the corresponding component, to obtain at least two processed components. That is, in the first splitting stage, the first side information may or may not be fused, which is not specifically limited in the embodiments of the present application.
- The CU division information may be used as the block division information corresponding to each component of the pixel information to be filtered.
- For the CU division information, the first value is filled in at each pixel position corresponding to a CU boundary, and the second value is filled in at the other pixel positions, to obtain the first matrix corresponding to the CU division information, where the first value differs from the second value. The first matrix is taken as the block division information corresponding to each component of the pixel information to be filtered.
- The first value can be a preset number, letter, etc.
- The second value can also be a preset number, letter, etc.
- The first value must differ from the second value; for example, the first value can be set to 2
- and the second value can be set to 1, but the embodiments of the present application do not specifically limit this.
- The CU division information may be used as the first side information to assist the filtering of the pixel information to be filtered. That is to say, in the process of video encoding of the original image in the video to be encoded, the CU division information can be fully utilized: it is fused with at least two components of the pixel information to be filtered and then guides the filtering.
- The CU division information is converted into a coding unit map (CUmap, Coding Unit Map) represented by a two-dimensional matrix, namely the CUmap matrix, which is the first matrix in the embodiments of this application. Taking the first image component of the original image as an example, it can be divided into multiple CUs; each pixel position on the boundary of each CU is filled with the first value, and the other pixel positions are filled with the second value, so that a first matrix reflecting the CU division information can be constructed.
- FIG. 4 is a schematic structural diagram of a block division matrix provided by an embodiment of this application. As shown in FIG. 4, each pixel position corresponding to a CU boundary is filled with 2 and the other pixel positions are filled with 1; that is, the pixel positions filled with 2 indicate the CU boundaries, so that the CU division information, that is, the first side information corresponding to the first image component of the pixel information to be filtered, can be determined.
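- A minimal sketch of building such a CUmap matrix is given below, with CU boundary positions filled with 2 and all other positions filled with 1. The function name `build_cu_map`, the `(top, left, height, width)` description of each CU, and the choice to treat the outermost ring of pixels of each CU as its boundary are all assumptions made for illustration.

```python
def build_cu_map(height, width, cus, boundary_value=2, interior_value=1):
    """Build a height x width matrix marking CU boundaries.

    `cus` is a list of (top, left, cu_height, cu_width) rectangles
    (a hypothetical representation of the CU division information).
    """
    cu_map = [[interior_value] * width for _ in range(height)]
    for top, left, cu_h, cu_w in cus:
        bottom, right = top + cu_h - 1, left + cu_w - 1
        for x in range(left, right + 1):   # top and bottom edges
            cu_map[top][x] = boundary_value
            cu_map[bottom][x] = boundary_value
        for y in range(top, bottom + 1):   # left and right edges
            cu_map[y][left] = boundary_value
            cu_map[y][right] = boundary_value
    return cu_map


# An 8 x 8 block split into two 4 x 4 CUs on top and one 4 x 8 CU below.
cus = [(0, 0, 4, 4), (0, 4, 4, 4), (4, 0, 4, 8)]
cu_map = build_cu_map(8, 8, cus)
for row in cu_map:
    print(row)
```

- The resulting matrix has the same size as the image component it describes, so it can be supplied to the filter as an extra input plane alongside that component.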
- The CU division information of the first image component may be different from that of the second image component or the third image component. When the CU division information of the first image component is different from the CU division information of the second image component or the third image component, the CU division information corresponding to the first image component of the pixel information to be filtered and the CU division information corresponding to the second image component or the third image component of the pixel information to be filtered need to be determined separately, and each is then fused, as first side information, into the corresponding first image component, second image component, or third image component. When the CU division information of the first image component is the same as the CU division information of the second image component or the third image component, only the CU division information of the first image component, the second image component, or the third image component needs to be determined, and that CU division information is then fused, as first side information, into the corresponding first image component, second image component, or third image component. In this way, the at least two new components obtained by fusion can subsequently be used to perform pre-processing filtering or post-processing filtering on the pixel information to be filtered.
- The first side information corresponding to each component may be determined based on the original image in the video to be encoded: the quantization parameter corresponding to each of the at least two components of the original image block is obtained, and that quantization parameter is used as the quantization parameter information corresponding to each component of the pixel information to be filtered.
- Using the quantization parameter as the quantization parameter information corresponding to each component of the pixel information to be filtered may consist of separately establishing a second matrix with the same size as each component of the original image, where each pixel position in the second matrix is filled with the normalized value of the quantization parameter corresponding to that component of the original image; the second matrix is then used as the quantization parameter information corresponding to each component of the pixel information to be filtered.
- the filter network can adaptively have the ability to process any quantization parameter during the training process.
- the quantization parameter information may also be used as the first side information to assist the image block to be filtered in filtering processing. That is to say, in the process of video encoding the original image in the video to be encoded, the quantization parameter information can be fully utilized to merge it with at least two components of the pixel information to be filtered and then guide filtering.
- the quantization parameter information can be normalized, and the quantization parameter information can also be non-normalized (such as classification processing, interval division processing, etc.); the following will take the quantization parameter normalization processing as an example for detailed description.
- the quantization parameter information is converted into a second matrix reflecting the quantization parameter information; that is to say, taking the first image component of the original image as an example, a matrix with the same size as the first image component of the original image is established.
- Each pixel position in the matrix is filled with the normalized value of the quantization parameter corresponding to the first image component of the original image; the normalized value of the quantization parameter is denoted by $QP_{max}(x, y)$, namely:
- $QP_{max}(x, y) = QP / QP_{max}$
- $QP$ represents the quantization parameter value corresponding to the first image component of the original image
- $x$ represents the abscissa value of each pixel position in the first image component of the original image
- $y$ represents the ordinate value of each pixel position in the first image component of the original image
- $QP_{max}$ represents the maximum value of the quantization parameter.
- The value of $QP_{max}$ is 51, but $QP_{max}$ can also take other values, such as 29 or 31; the embodiments of this application are not specifically limited.
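As a minimal sketch of the second matrix, the following NumPy snippet fills every pixel position with the normalized quantization parameter QP / QP_max. The function name `build_qp_map` and the example values (a 4×6 component, QP = 37) are assumptions for illustration; the default QP_max = 51 is the value stated in the text.

```python
import numpy as np

def build_qp_map(height, width, qp, qp_max=51):
    """Return a (height, width) matrix filled with the normalized QP."""
    if not 0 <= qp <= qp_max:
        raise ValueError("qp must lie in [0, qp_max]")
    return np.full((height, width), qp / qp_max, dtype=np.float32)

# Hypothetical example: a 4x6 image component coded with QP = 37.
qp_map = build_qp_map(4, 6, qp=37)
print(qp_map[0, 0])
```

Because the matrix is constant over the component, it carries only the quantization level, which is what lets one trained filter network handle inputs coded at different quantization parameters.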
- S3031 may include:
- the second side information is different from the first side information.
- S3032 may include:
- the first side information and the second side information can be used for auxiliary filtering to improve filtering quality.
- The first side information and the second side information may each be one or more of block division information, quantization parameter information, MV information, and prediction direction information. In other words, when the first side information is block division information, the second side information may be quantization parameter information; or, when the first side information is quantization parameter information, the second side information may be block division information; or, when the first side information is block division information and quantization parameter information, the second side information may be MV information; or, when the first side information is block division information, the second side information may be quantization parameter information and MV information; the embodiment of the present application does not specifically limit it.
- The fusion stages of the first side information and the second side information may be the same or different. It is assumed that the first branching stage denotes the processing stage in which the at least two components of the pixel information to be filtered are handled separately, the merging stage denotes the processing stage corresponding to the fusion information of the pixel information to be filtered, and the second branching stage denotes the processing stage in which the residual information of each component is determined after the fusion processing.
- The fusion stage of the first side information can be any one of the first branching stage, the merging stage, or the second branching stage, and the fusion stage of the second side information can likewise be any one of the first branching stage, the merging stage, or the second branching stage.
- For example, the fusion stage of the second side information may also be the first branching stage; or the fusion stage of the first side information may be the merging stage, and the fusion stage of the second side information may also be the merging stage; the embodiment of the application does not specifically limit it.
- S3033 may include:
- Joint processing and branch processing are performed on the fusion information to obtain residual information corresponding to at least one of the at least two components;
- At least one component of the at least two components and the residual information corresponding to the at least one component are summed to obtain at least one component after filtering the pixel information to be filtered.
- Pre-processing filtering or post-processing filtering in the embodiments of the present application adopts a multi-stage cascade processing structure, such as a split-merge-split processing structure, a split-merge processing structure, or a merge-split processing structure, which is not specifically limited in the embodiment of the present application.
- Specifically, if the at least two components of the pixel information to be filtered first need to be obtained separately, that is the first branching stage; the at least two components are then fused, which is the merging stage. After all the information has been fused, when multiple components such as the first image component, the second image component, and the third image component need to be output at the same time, the fusion information is jointly processed to obtain the residual information corresponding to the first image component, the residual information corresponding to the second image component, and the residual information corresponding to the third image component. The first image component is then summed with its residual information, the second image component is summed with its residual information, and the third image component is summed with its residual information, to obtain the filtered first image component, the filtered second image component, and the filtered third image component of the pixel information to be filtered; this process is the second branching stage. The entire pre-processing filtering or post-processing filtering process then adopts a split-merge-split processing structure.
- the pre-processing filtering or post-processing filtering in the embodiments of the present application may also adopt more cascaded processing structures, such as branch-merge-split-merge-split processing structures.
- The embodiments of the present application may adopt a typical cascade structure, such as a split-merge-split processing structure; a cascade processing structure with fewer stages than the typical cascade structure, such as a split-merge processing structure or a merge-split processing structure; or a cascade processing structure with more stages than the typical cascade structure, such as a split-merge-split-merge-split processing structure; the embodiments of this application are not specifically limited.
- the pre-processing filter or the post-processing filter may include a convolutional neural network filter.
- the pre-processing filter or the post-processing filter may be a convolutional neural network filter, or may be another filter established by deep learning, which is not specifically limited in the embodiment of the present application.
- The convolutional neural network filter, also called a CNN filter, is a type of feedforward neural network that includes convolution calculations and has a deep structure, and is one of the representative algorithms of deep learning.
- the input layer of the CNN filter can process multi-dimensional data, such as the three image component (Y/U/V) channels of the original image in the video to be encoded.
- FIG. 5 is a schematic structural diagram of a traditional CNN filter provided by an embodiment of the application.
- The traditional CNN filter 50 is an improvement based on the previous-generation video coding standard H.265/High Efficiency Video Coding (HEVC). It contains a 2-layer convolutional network structure, which can replace the deblocking filter and the sample adaptive offset filter.
- After the image to be filtered (denoted by F_in) is input to the input layer of the traditional CNN filter 50, it passes through the first-layer convolutional network F_1 (assuming a convolution kernel size of 3×3 and 64 feature maps) and the second-layer convolutional network F_2 (assuming a convolution kernel size of 5×5 and 32 feature maps) to obtain the residual information F_3; the image to be filtered F_in and the residual information F_3 are then summed to obtain the filtered image output by the traditional CNN filter 50 (denoted by F_out).
- the convolutional network structure is also called residual neural network, which is used to output residual information corresponding to the image to be filtered.
- For the traditional CNN filter 60, the three image components of the image to be filtered are processed independently, but the same filter network and its related parameters are shared.
- FIG. 6A and FIG. 6B are each a schematic structural diagram of another traditional CNN filter provided by an embodiment of this application. Referring to FIG. 6A and FIG. 6B, the traditional CNN filter 60 uses two filter networks: the filter network shown in FIG. 6A is dedicated to outputting the first image component, and the filter network shown in FIG. 6B is dedicated to outputting the second image component or the third image component.
- Assume that the size information corresponding to the first image component is H×W. The first image component can be rearranged by pixels and converted to the form (H/2)×(W/2)×4; because the size information corresponding to the second image component or the third image component is (H/2)×(W/2), the three image components are then combined and transformed into the form (H/2)×(W/2)×6 and input to the traditional CNN filter 60.
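The pixel rearrangement can be sketched in NumPy as follows. This is an illustration under stated assumptions: 4:2:0 subsampling (U and V are half the height and width of Y), a 2×2 space-to-depth ordering of the four Y sub-planes, and channels-last layout. The function names `space_to_depth_2x2` and `pack_yuv420` are hypothetical, and the exact sub-plane ordering used in the cited filter is not given in the text.

```python
import numpy as np

def space_to_depth_2x2(y):
    """Rearrange an (H, W) plane into (H/2, W/2, 4) by 2x2 pixel blocks."""
    h, w = y.shape
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    # (H, W) -> (H/2, 2, W/2, 2) -> (H/2, W/2, 2, 2) -> (H/2, W/2, 4)
    blocks = y.reshape(h // 2, 2, w // 2, 2).transpose(0, 2, 1, 3)
    return blocks.reshape(h // 2, w // 2, 4)

def pack_yuv420(y, u, v):
    """Combine Y (H, W) with U and V (H/2, W/2) into one (H/2, W/2, 6) array."""
    y4 = space_to_depth_2x2(y)
    return np.concatenate([y4, u[..., None], v[..., None]], axis=-1)

# Hypothetical 4x4 luma plane with 2x2 chroma planes.
y = np.arange(16, dtype=np.float32).reshape(4, 4)
u = np.zeros((2, 2), dtype=np.float32)
v = np.ones((2, 2), dtype=np.float32)
packed = pack_yuv420(y, u, v)
print(packed.shape)  # (2, 2, 6)
```

Each output position holds the four luma samples of one 2×2 block followed by the co-located U and V samples, which is consistent with the 6 input channels, 4 luma residual channels, and 2 chroma residual channels described for the networks of FIG. 6A and FIG. 6B.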
- In the filter network shown in FIG. 6A, after the input layer network receives the image to be filtered F_in (assuming a convolution kernel size of N×N and 6 channels), it passes through the first-layer convolutional network F_{1-Y} (assuming a convolution kernel size of L1×L1, M convolution kernels, and 6 channels) and the second-layer convolutional network F_{2-Y} (assuming a convolution kernel size of L2×L2, 4 convolution kernels, and M channels) to obtain the residual information F_{3-Y} (assuming a convolution kernel size of N×N and 4 channels); the input image to be filtered F_in and the residual information F_{3-Y} are then summed to obtain the filtered first image component output by the traditional CNN filter 60.
- In the filter network shown in FIG. 6B, after the input layer network receives the image to be filtered F_in (assuming a convolution kernel size of N×N and 6 channels), it passes through the first-layer convolutional network F_{1-U} (assuming a convolution kernel size of L1×L1, M convolution kernels, and 6 channels) and the second-layer convolutional network F_{2-U} (assuming a convolution kernel size of L2×L2, 2 convolution kernels, and M channels) to obtain the residual information F_{3-U} (assuming a convolution kernel size of N×N and 2 channels); the input image to be filtered F_in and the residual information F_{3-U} are then summed to obtain the filtered second image component or the filtered third image component output by the traditional CNN filter 60 (denoted by F_{out-U}).
- For the traditional CNN filter 50 shown in FIG. 5 or the traditional CNN filter 60 shown in FIG. 6A and FIG. 6B, processing each image component independently is not reasonable, because the relationship between different image components is not considered; in addition, coding parameters such as block division information and QP information are not fully utilized at the input.
- However, the distortion of the reconstructed image mainly comes from blocking artifacts, and the boundary information of blocking artifacts is determined by the CU division information; that is, the filter network in the CNN filter should focus on the boundary area. In addition, integrating quantization parameter information into the filter network also helps to improve its generalization ability, so that it can filter distorted images of any quality.
- The filtering method provided by the embodiments of the present application not only sets the CNN filter structure reasonably, so that the same filter network can receive multiple image components at the same time, but also fully considers the relationship between these image components, and the enhanced images of these image components can be output at the same time. In addition, the filtering method can also incorporate coding parameters such as block division information and/or QP information as coding information for auxiliary filtering, thereby improving filtering quality.
- S3031 in the embodiment of the present application may determine the side information corresponding to each component (such as the first side information or the second side information) for the first image component, the second image component, and the third image component of the pixel information to be filtered, in which case three new components are obtained after fusion processing; or for the first image component and the second image component of the pixel information to be filtered, in which case two new components are obtained after fusion processing; or for the first image component and the third image component of the pixel information to be filtered, in which case two new components are obtained after fusion processing; or for the second image component and the third image component of the pixel information to be filtered, in which case two new components are obtained after fusion processing; the embodiment of the application does not specifically limit it.
- The fusion information for the pixel information to be filtered can be obtained by directly fusing at least two components, or by fusing at least two components together with the corresponding side information (such as the first side information or the second side information); the embodiments of this application are not specifically limited.
- If the fusion information is obtained by directly fusing at least two components, the first image component, the second image component, and the third image component of the pixel information to be filtered can be fused to obtain the fusion information; or the first image component and the second image component can be fused to obtain the fusion information; or the first image component and the third image component can be fused to obtain the fusion information; or the second image component and the third image component can be fused to obtain the fusion information.
- If the fusion information is obtained by fusing at least two components together with the corresponding side information (such as the first side information or the second side information), the first image component, the second image component, the third image component, and the side information of the pixel information to be filtered can be fused to obtain the fusion information; or the first image component, the second image component, and the side information can be fused to obtain the fusion information; or the first image component, the third image component, and the side information can be fused to obtain the fusion information; or the second image component, the third image component, and the side information can be fused to obtain the fusion information.
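The fusion options listed above can be illustrated with a minimal NumPy sketch. The text does not specify the fusion operator, so the sketch assumes that fusion means stacking the selected planes along a channel axis and that all planes share one spatial size (for example 4:4:4 content, or planes already resampled); the function name `fuse` is hypothetical.

```python
import numpy as np

def fuse(components, side_info=()):
    """Stack image components and optional side-information maps channel-wise.

    Assumption: fusion is channel-wise stacking of equally sized planes.
    """
    planes = list(components) + list(side_info)
    shape = planes[0].shape
    if any(p.shape != shape for p in planes):
        raise ValueError("all planes must share one spatial size")
    return np.stack(planes, axis=0)  # shape: (channels, H, W)

rng = np.random.default_rng(0)
h, w = 4, 4
y, u, v = (rng.random((h, w), dtype=np.float32) for _ in range(3))
cu_map = np.ones((h, w), dtype=np.float32)  # stand-in side information

fused_all = fuse([y, u, v], [cu_map])  # three components plus side information
fused_yu = fuse([y, u])                # two components, no side information
print(fused_all.shape, fused_yu.shape)
```

Any of the listed combinations corresponds to a different choice of the `components` and `side_info` arguments.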
- the corresponding coding information such as the first side information or the second side information
- S303 in the embodiment of the present application specifically means that, after the components of the pixel information to be filtered (such as the first image component, the second image component, and the third image component) and the side information (such as the first side information or the second side information) are fused and input to the filter, the filter may output only the filtered first image component, or only the filtered second image component, or only the filtered third image component of the pixel information to be filtered; or it may output the filtered first image component and the filtered second image component, or the filtered second image component and the filtered third image component, or the filtered first image component and the filtered third image component; or it may output the filtered first image component, the filtered second image component, and the filtered third image component of the pixel information to be filtered; the embodiment of the present application does not specifically limit it.
- FIG. 7 is a schematic structural diagram of an optional filtering framework provided by an embodiment of the present application. As shown in FIG. 7, the filtering framework 70 can include three components of the pixel information to be filtered (denoted by Y, U, and V, respectively) 701, a first branching unit 702, first side information 703, a Y image component first processing unit 704, a U image component first processing unit 705, a V image component first processing unit 706, second side information 707, an input fusion unit 708, a joint processing unit 709, a second branching unit 710, a Y image component second processing unit 711, a U image component second processing unit 712, a V image component second processing unit 713, a first adder 714, a second adder 715, a third adder 716, and three filtered image components (denoted by Out_Y, Out_U, and Out_V, respectively) 717.
- After the three image components 701 of the image block to be filtered pass through the first branching unit 702, they are divided into three signals: the Y image component, the U image component, and the V image component. The Y image component and the corresponding first side information 703 enter the Y image component first processing unit 704, the U image component and the corresponding first side information 703 enter the U image component first processing unit 705, and the V image component and the corresponding first side information 703 enter the V image component first processing unit 706, so that three new image components are output.
- The input fusion unit 708 fuses the three new image components with the second side information 707 and inputs the result to the joint processing unit 709.
- The joint processing unit 709 includes a multi-layer convolution filter network for performing convolution calculations on the input information; because the specific convolution calculation process is similar to related technical solutions, the specific execution steps of the joint processing unit 709 are not described again.
- The output of the joint processing unit 709 enters the second branching unit 710, which divides it into three signals again; the three signals are input to the Y image component second processing unit 711, the U image component second processing unit 712, and the V image component second processing unit 713, which in turn yield the residual information of the Y image component, the residual information of the U image component, and the residual information of the V image component.
- The Y image component of the three image components 701 of the image block to be filtered and the obtained residual information of the Y image component are input together to the first adder 714, and the output of the first adder 714 is the filtered Y image component (denoted by Out_Y).
- The U image component of the three image components 701 of the image block to be filtered and the obtained residual information of the U image component are input together to the second adder 715, and the output of the second adder 715 is the filtered U image component (denoted by Out_U).
- The V image component of the three image components 701 of the image block to be filtered and the obtained residual information of the V image component are input together to the third adder 716, and the output of the third adder 716 is the filtered V image component (denoted by Out_V).
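The split-merge-split data flow of the filtering framework 70 can be sketched as follows. This is only a structural illustration: the three helper functions are simple arithmetic stand-ins for the convolutional sub-networks (units 704 to 706, 709, and 711 to 713), which the text does not specify, and all planes are assumed to share one spatial size. All function names are hypothetical.

```python
import numpy as np

def first_processing(component, side_info):
    # Stand-in for units 704/705/706: mixes a component with first side info.
    return 0.5 * (component + side_info)

def joint_processing(fused):
    # Stand-in for unit 709: lets every channel see all other channels.
    return fused - fused.mean(axis=0, keepdims=True)

def second_processing(branch):
    # Stand-in for units 711/712/713: produces per-component residual info.
    return 0.1 * branch

def filter_framework_70(y, u, v, side1, side2):
    components = (y, u, v)
    # First branching stage: per-component processing with first side info.
    new_components = [first_processing(c, side1) for c in components]
    # Merging stage: fuse the new components with second side info (unit 708).
    fused = np.stack(new_components + [side2], axis=0)
    joint = joint_processing(fused)
    # Second branching stage: one residual per component (unit 710 onward).
    residuals = [second_processing(joint[i]) for i in range(3)]
    # Adders 714/715/716: component plus its residual information.
    return tuple(c + r for c, r in zip(components, residuals))

rng = np.random.default_rng(0)
y, u, v = (rng.random((4, 4)) for _ in range(3))
side1 = np.ones((4, 4))          # e.g. a CUmap-like matrix
side2 = np.full((4, 4), 37 / 51)  # e.g. a normalized QP matrix
out_y, out_u, out_v = filter_framework_70(y, u, v, side1, side2)
print(out_y.shape, out_u.shape, out_v.shape)
```

A single call produces all three filtered components, which reflects the point that the components are processed in one pass rather than by three separate networks.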
- If only the filtered Y image component needs to be output, the filtering framework 70 may not include the second branching unit 710, the second adder 715, and the third adder 716; if only the filtered U image component needs to be output, the filtering framework 70 may not include the second branching unit 710, the first adder 714, and the third adder 716; if the filtered Y image component and the filtered U image component need to be output, the filtering framework 70 may not include the third adder 716; the embodiment of the present application sets no specific limitation.
- FIG. 8 is a schematic structural diagram of another optional filtering framework provided by an embodiment of the application. The filtering framework 80 may include two components of the pixel information to be filtered (denoted by Y and U, respectively) 801, a first branching unit 702, first side information 703, a Y image component first processing unit 704, and so on.
- After the two image components 801 of the image block to be filtered pass through the first branching unit 702, they are divided into two signals: the Y image component and the U image component. The Y image component and the corresponding first side information 703 enter the Y image component first processing unit 704, and the U image component and the corresponding first side information 703 enter the U image component first processing unit 705, so that two new image components are output.
- The input fusion unit 708 fuses the two new image components and inputs the result to the joint processing unit 709. Because only a single image component (that is, the filtered Y image component) needs to be output after the joint processing unit 709, there is no need to enter the second branching unit 710; the output can be input directly to the Y image component second processing unit 711 to obtain the residual information of the Y image component. The Y image component of the two image components 801 of the image block to be filtered and the obtained residual information of the Y image component are input together to the first adder 714, and the output of the first adder 714 is the filtered Y image component (denoted by Out_Y).
- The pre-processing filter and the post-processing filter in the embodiment of the present application may include at least the input fusion unit 708, the joint processing unit 709, the first adder 714, the second adder 715, and the third adder 716; they may also include the first branching unit 702, the Y image component first processing unit 704, the U image component first processing unit 705, the V image component first processing unit 706, etc.; and they may even include the second branching unit 710, the Y image component second processing unit 711, the U image component second processing unit 712, the V image component second processing unit 713, etc., which are not specifically limited in the embodiment of the present application.
- The filtering method provided in the embodiments of the present application may adopt a split-merge-split processing structure, such as the filtering framework 70 shown in FIG. 7; or a split-merge processing structure with fewer stages, such as the filtering framework 80 shown in FIG. 8; or a merge-split processing structure with fewer stages; or a split-merge-split-merge-split processing structure with more stages; the processing structure is not specifically limited in the embodiment of this application.
- The first side information and the second side information can both participate in the filtering process, as in the filtering framework 70 shown in FIG. 7; they can also participate in the filtering process selectively, as in the filtering framework 80 shown in FIG. 8, in which the second side information does not participate in the filtering processing.
- That is, both the first side information and the second side information may participate in the filtering processing, or the first side information may not participate in the filtering processing, or the second side information may not participate in the filtering processing, or even neither the first side information nor the second side information may participate in the filtering processing, which is not specifically limited in the embodiment of the present application.
- The fusion stages of the first side information and the second side information can be the same or different; that is, the first side information and the second side information can participate in the filtering process at the same stage or at different stages, which is not specifically limited in the embodiment of the present application. For example, still taking the filtering framework 70 shown in FIG. 7, both the first side information 703 and the second side information 707 can participate in the filtering process in the stage corresponding to the first branching unit 702, or both can participate in the stage corresponding to the input fusion unit 708, or both can participate in the stage corresponding to the second branching unit 710.
- Alternatively, the first side information 703 participates in the filtering process in the stage corresponding to the first branching unit 702 and the second side information 707 participates in the stage corresponding to the input fusion unit 708; or the first side information 703 participates before the stage corresponding to the first branching unit 702 and the second side information 707 participates in the stage corresponding to the input fusion unit 708; or the first side information 703 participates before the stage corresponding to the first branching unit 702 and the second side information 707 participates in the stage corresponding to the second branching unit 710; or the first side information 703 participates in the stage corresponding to the input fusion unit 708 and the second side information 707 participates in the stage corresponding to the second branching unit 710. That is, the fusion stages of the first side information 703 and the second side information 707 can be the same or different.
- the filtering framework 70 shown in FIG. 7 uses a deep learning network (such as CNN) for filtering.
- Unlike traditional CNN filters, the filters in the embodiments of the present application adopt a cascaded processing structure.
- The three components of the pixel information to be filtered can be input into the filter network simultaneously, and other coding-related side information (such as block division information, quantization parameter information, MV information and other coding parameters) can be incorporated, either at the same stage or at different stages of the filter network. In this way, not only is the relationship between the three components fully utilized, but other coding-related information is also used to assist the filtering, which improves the filtering quality. In addition, processing the three components simultaneously avoids performing three complete network forward calculations for the three components, thereby reducing the computational complexity and saving the coding rate.
- FIG. 9 is a schematic structural diagram of another optional filtering framework provided by an embodiment of the present application.
- The filtering framework 90 may include three components of the pixel information to be filtered (denoted by Y, U, and V, respectively) 901, first side information 902, a Y image component first processing unit 903, a U image component first processing unit 904, a V image component first processing unit 905, second side information 906, a fusion unit 907, a joint processing unit 908, a branching unit 909, a Y image component second processing unit 910, a U image component second processing unit 911, a V image component second processing unit 912, a first adder 913, a second adder 914, a third adder 915, and the three filtered image components (denoted by Out_Y, Out_U, and Out_V, respectively) 916.
- the three components 901 of the pixel information to be filtered undergo component processing and are split into three signals: the Y image component, the U image component, and the V image component. The Y image component and its corresponding first side information 902 enter the Y image component first processing unit 903; the U image component and its corresponding first side information 902 enter the U image component first processing unit 904; the V image component and its corresponding first side information 902 enter the V image component first processing unit 905. These three units output three new image components;
- the fusion unit 907 is used to fuse the three new image components with the second side information 906 and then input the result to the joint processing unit 908;
- the joint processing unit 908 includes a multi-layer convolutional filter network that performs convolution calculations on the input information; because the specific convolution calculation process is similar to related technical solutions, the specific execution steps of the joint processing unit 908 will not be described again.
- after the joint processing unit 908, the result enters the branching (demultiplexing) unit 909, which splits it into three signals again; the three signals are input to the Y image component second processing unit 910, the U image component second processing unit 911, and the V image component second processing unit 912, which yield, in turn, the residual information of the Y image component, the residual information of the U image component, and the residual information of the V image component. The Y image component in the three components 901 of the pixel information to be filtered and the obtained residual information of the Y image component are jointly input to the first adder 913, and the output of the first adder 913 is the filtered Y image component (denoted by Out_Y); the U image component in 901 and the obtained residual information of the U image component are jointly input to the second adder 914, and the output of the second adder 914 is the filtered U image component (denoted by Out_U); likewise, the V image component in 901 and the obtained residual information of the V image component are jointly input to the third adder 915, and the output of the third adder 915 is the filtered V image component (denoted by Out_V).
- if only the filtered Y image component needs to be output, the filtering framework 90 may not include the branching (demultiplexing) unit 909, the second adder 914, and the third adder 915; if only the filtered U image component needs to be output, the filtering framework 90 may not include the branching (demultiplexing) unit 909, the first adder 913, and the third adder 915; if the filtered Y image component and the filtered U image component need to be output, the filtering framework 90 may not include the third adder 915; the embodiments of the present application do not specifically limit this.
- the neural network architecture provided by the embodiments of the present application can reasonably and effectively utilize various components and side information, and can bring about better coding performance.
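The data flow of FIG. 9 can be sketched in a few lines of Python. This is only a toy model of the wiring: the arithmetic inside `first_processing`, `joint_processing`, and `second_processing`, the `GAINS` table, and the assumption that all planes have the same number of samples are placeholders invented for illustration, standing in for the learned convolutional layers that the text leaves unspecified.

```python
from typing import Dict, List

Plane = List[float]  # one flattened image component (equal sizes assumed)

# Hypothetical per-branch gains standing in for learned weights.
GAINS = {"Y": 0.5, "U": 0.25, "V": 0.25}


def first_processing(component: Plane, side_info: Plane) -> Plane:
    # Units 903-905: merge a component with its first side information.
    return [c + s for c, s in zip(component, side_info)]


def fuse(*planes: Plane) -> List[List[float]]:
    # Fusion unit 907: stack planes so each pixel carries all channels.
    return [list(values) for values in zip(*planes)]


def joint_processing(fused: List[List[float]]) -> Plane:
    # Joint processing unit 908: one shared feature per pixel (toy mean).
    return [sum(pixel) / len(pixel) for pixel in fused]


def second_processing(joint: Plane, name: str) -> Plane:
    # Units 910-912 after branching unit 909: per-component residual.
    return [GAINS[name] * value for value in joint]


def filter_yuv(y: Plane, u: Plane, v: Plane,
               first_side: Plane, second_side: Plane) -> Dict[str, Plane]:
    components = {"Y": y, "U": u, "V": v}
    processed = {n: first_processing(p, first_side)
                 for n, p in components.items()}
    fused = fuse(processed["Y"], processed["U"], processed["V"], second_side)
    joint = joint_processing(fused)  # a single pass shared by all components
    outputs = {}
    for name, plane in components.items():
        residual = second_processing(joint, name)
        # Adders 913-915: original component plus its residual.
        outputs["Out_" + name] = [p + r for p, r in zip(plane, residual)]
    return outputs
```

The point of the sketch is the shape of the pipeline: the joint stage runs once for all three components, and only the lightweight per-component branches and adders are repeated.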
- a filtering device obtains pixel information to be filtered, obtains at least one type of side information, and inputs at least two components of the pixel information to be filtered and the at least one type of side information into the neural network-based filter, which outputs at least one filtered component of the pixel information to be filtered; that is, in the embodiments of the present application, at least two components of the pixel information to be filtered and at least one type of side information are obtained and input into the neural network-based filter for processing, and the side information of at least one component is incorporated during filtering to obtain the filtered pixel information.
- in this way, the relationship among multiple components is fully utilized, and multiple complete network forward passes for the at least two components are avoided, which reduces computational complexity and saves coding bit rate; it also improves the quality of the images obtained after pre-processing filtering and after post-processing filtering in the encoding and decoding process, thereby improving the quality of the reconstructed image.
- FIG. 10 is a schematic structural diagram of an optional filtering device provided by an embodiment of the application.
- the filtering device may include: a first obtaining module 101, a second obtaining module 102, and a determining module 103, where
- the first obtaining module 101 is configured to obtain pixel information to be filtered
- the second obtaining module 102 is configured to obtain at least one type of side information
- the determining module 103 is configured to input at least two components of the pixel information to be filtered and at least one type of side information into the neural network-based filter to output at least one component of the pixel information to be filtered after filtering.
- the determining module 103 may include:
- the first processing submodule is configured to separately process each of the at least two components to obtain the processed at least two components
- the fusion sub-module is configured to perform fusion processing according to at least one type of side information and at least two processed components to obtain fusion information of the pixel information to be filtered;
- the second processing sub-module is configured to process the fusion information to obtain at least one component after filtering the pixel information to be filtered.
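The module split described above can be mirrored by a small container class. The sketch below is only one way to read FIG. 10: the class name, field names, and the idea of passing each module in as a callable are assumptions made for this example, and the input checks simply restate the "at least two components" and "at least one type of side information" conditions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Plane = List[float]
Planes = Dict[str, Plane]


@dataclass
class FilteringDevice:
    get_pixels: Callable[[], Planes]                 # first obtaining module 101
    get_side_info: Callable[[], Planes]              # second obtaining module 102
    neural_filter: Callable[[Planes, Planes], Planes]  # used by determining module 103

    def run(self) -> Planes:
        pixels = self.get_pixels()
        if len(pixels) < 2:
            raise ValueError("at least two components are required")
        side_info = self.get_side_info()
        if not side_info:
            raise ValueError("at least one type of side information is required")
        # Determining module 103: feed components and side information to the
        # filter and return whatever filtered components it produces.
        return self.neural_filter(pixels, side_info)
```

Any callable with the right shape can be plugged in as `neural_filter`, which keeps the sketch independent of how the network itself is built.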
- the first processing sub-module may be specifically configured as:
- Component processing is performed on each of the at least two components to obtain at least two components after processing.
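- when the first side information corresponding to each component is acquired, the first processing sub-module may correspondingly be specifically configured to fuse each of the at least two components with the first side information corresponding to that component, to obtain the at least two processed components;
- the first side information includes at least block division information and/or quantization parameter information.
- the fusion sub-module may be specifically configured to fuse the at least two processed components with the first side information corresponding to each component, to obtain the fusion information of the pixel information to be filtered.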
- the second processing sub-module can be specifically configured as:
- Joint processing and branch processing are performed on the fusion information to obtain residual information corresponding to at least one of the at least two components;
- At least one component of the at least two components and residual information corresponding to the at least one component are summed to obtain at least one component after filtering the pixel information to be filtered.
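The summation step can be sketched as follows. Only components that have residual information produce an output, which matches the "at least one component" wording; the function name, the integer sample representation, and the clipping to the valid range for `bit_depth` are assumptions added for this example and are not stated in the text.

```python
from typing import Dict, List


def add_residuals(components: Dict[str, List[int]],
                  residuals: Dict[str, List[int]],
                  bit_depth: int = 8) -> Dict[str, List[int]]:
    """Sum each component with its residual, for the components that have one.

    Clipping to [0, 2**bit_depth - 1] is an assumption of this sketch.
    """
    max_value = (1 << bit_depth) - 1
    filtered = {}
    for name, residual in residuals.items():
        plane = components[name]
        filtered["Out_" + name] = [
            min(max(sample + delta, 0), max_value)
            for sample, delta in zip(plane, residual)
        ]
    return filtered
```

For example, passing Y and U components but a residual only for Y yields a single `Out_Y` plane, which corresponds to a configuration where only the filtered Y image component is required.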
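- when the second side information corresponding to each component is acquired, the first processing sub-module may correspondingly be specifically configured to fuse each of the at least two components with the second side information corresponding to that component, to obtain the at least two processed components;
- the second side information is different from the first side information.
- the fusion sub-module may be specifically configured to fuse the at least two processed components with the second side information corresponding to each component, to obtain the fusion information of the pixel information to be filtered.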
- the structure of the neural network includes at least a joint processing stage and an independent processing stage; in the joint processing stage, all components are processed together; in the independent processing stage, each component is processed on an independent branch of the neural network.
- a "unit" may be a part of a circuit, a part of a processor, or a part of a program or software; it may also be a module, or it may be non-modular.
- the various components in this embodiment may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
- the above-mentioned integrated unit can be realized in the form of hardware or software function module.
- if the integrated unit is implemented in the form of a software function module and is not sold or used as an independent product, it can be stored in a computer-readable storage medium.
- the technical solution of this embodiment, in essence, or the part that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product.
- the computer software product is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor to execute all or part of the steps of the method described in this embodiment.
- the aforementioned storage media include: U disk, mobile hard disk, read only memory (Read Only Memory, ROM), random access memory (Random Access Memory, RAM), magnetic disk or optical disk and other media that can store program codes.
- FIG. 11 is a schematic structural diagram of an optional encoder provided by an embodiment of this application. As shown in FIG. 11, an embodiment of this application provides an encoder 1100.
- the encoder 1100 includes a processor 1101 and a storage medium 1102 storing instructions executable by the processor 1101.
- the storage medium 1102 relies on the processor 1101 to perform operations through the communication bus 1103.
- when the instructions are executed by the processor 1101, the filtering method of the foregoing embodiment is executed.
- the communication bus 1103 is used to implement connection and communication between these components.
- the communication bus 1103 also includes a power bus, a control bus, and a status signal bus.
- various buses are marked as the communication bus 1103 in FIG. 11.
- An embodiment of the present application provides a computer storage medium that stores executable instructions.
- when the executable instructions are executed by one or more processors, the processors execute the filtering method described in one or more of the embodiments above.
- the memory in the embodiment of the present application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory.
- the non-volatile memory can be read-only memory (Read-Only Memory, ROM), programmable read-only memory (Programmable ROM, PROM), erasable programmable read-only memory (Erasable PROM, EPROM), electrically erasable programmable read-only memory (Electrically EPROM, EEPROM), or flash memory.
- the volatile memory may be a random access memory (Random Access Memory, RAM), which is used as an external cache.
- SRAM static random access memory (Static RAM)
- DRAM dynamic random access memory (Dynamic RAM)
- SDRAM synchronous dynamic random access memory (Synchronous DRAM)
- DDRSDRAM double data rate synchronous dynamic random access memory (Double Data Rate SDRAM)
- ESDRAM enhanced synchronous dynamic random access memory (Enhanced SDRAM)
- SLDRAM synchronous link dynamic random access memory (Synchlink DRAM)
- DRRAM direct Rambus random access memory (Direct Rambus RAM)
- the processor may be an integrated circuit chip with signal processing capabilities.
- the steps of the above method can be completed by hardware integrated logic circuits in the processor or instructions in the form of software.
- the aforementioned processor may be a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (Field Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
- DSP Digital Signal Processor
- ASIC application specific integrated circuit
- FPGA field-programmable gate array
- the methods, steps, and logical block diagrams disclosed in the embodiments of the present application can be implemented or executed.
- the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
- the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
- the software module can be located in a storage medium that is well established in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers.
- the storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
- the embodiments described herein can be implemented by hardware, software, firmware, middleware, microcode, or a combination thereof.
- the processing unit can be implemented in one or more Application Specific Integrated Circuits (ASIC), Digital Signal Processors (DSP), Digital Signal Processing Devices (DSP Device, DSPD), Programmable Logic Devices (PLD), Field-Programmable Gate Arrays (FPGA), general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described in this application, or a combination thereof.
- ASIC Application Specific Integrated Circuit
- DSP Digital Signal Processor
- DSPD Digital Signal Processing Device (DSP Device)
- PLD Programmable Logic Device
- FPGA Field-Programmable Gate Array
- the technology described herein can be implemented through modules (such as procedures, functions, etc.) that perform the functions described herein.
- the software codes can be stored in the memory and executed by the processor.
- the memory can be implemented in the processor or external to the processor.
- the method of the above embodiments can be implemented by means of software plus the necessary general hardware platform; it can also be implemented by hardware, but in many cases the former is the better implementation.
- the technical solution of this application, in essence, or the part that contributes to the existing technology, can be embodied in the form of a software product; the computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions to enable a terminal (which may be a mobile phone, a computer, a server, or a network device, etc.) to execute the method described in each embodiment of the present application.
- the filtering device obtains pixel information to be filtered, obtains at least one type of side information, and inputs at least two components of the pixel information to be filtered and the at least one type of side information into the neural network-based filter, which outputs at least one filtered component of the pixel information to be filtered; that is, in the embodiments of the present application, at least two components of the pixel information to be filtered and at least one type of side information are obtained and input into the neural network-based filter for processing, and the side information of at least one component is incorporated during filtering to obtain the filtered pixel information.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Claims (12)
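- A filtering method, wherein the method comprises: obtaining pixel information to be filtered; obtaining at least one type of side information; and inputting at least two components of the pixel information to be filtered and the at least one type of side information into a neural network-based filter, to output at least one filtered component of the pixel information to be filtered.
- The method according to claim 1, wherein inputting the at least two components of the pixel information to be filtered and the at least one type of side information into the neural network-based filter, to output the at least one filtered component of the pixel information to be filtered, comprises: processing each of the at least two components separately to obtain at least two processed components; performing fusion processing according to the at least one type of side information and the at least two processed components to obtain fusion information of the pixel information to be filtered; and processing the fusion information to obtain the at least one filtered component of the pixel information to be filtered.
- The method according to claim 2, wherein processing each of the at least two components separately to obtain the at least two processed components comprises: performing component processing on each of the at least two components separately to obtain the at least two processed components.
- The method according to claim 2, wherein, when first side information corresponding to each component is obtained, correspondingly, processing each of the at least two components separately to obtain the at least two processed components comprises: fusing each of the at least two components with the first side information corresponding to that component to obtain the at least two processed components; wherein the first side information comprises at least block division information and/or quantization parameter information.
- The method according to claim 4, wherein performing fusion processing according to the at least one type of side information and the at least two processed components to obtain the fusion information of the pixel information to be filtered comprises: fusing the at least two processed components with the first side information corresponding to each component to obtain the fusion information of the pixel information to be filtered.
- The method according to claim 2, wherein processing the fusion information to obtain the at least one filtered component of the pixel information to be filtered comprises: performing joint processing and branch processing on the fusion information to obtain residual information corresponding to at least one of the at least two components; and summing the at least one of the at least two components with the residual information corresponding to the at least one component, to obtain the at least one filtered component of the pixel information to be filtered.
- The method according to any one of claims 1 to 6, wherein, when second side information corresponding to each component is obtained, correspondingly, processing each of the at least two components separately to obtain the at least two processed components comprises: fusing each of the at least two components with the second side information corresponding to that component to obtain the at least two processed components; wherein the second side information is different from the first side information.
- The method according to claim 7, wherein performing fusion processing according to the at least one type of side information and the at least two processed components to obtain the fusion information of the pixel information to be filtered comprises: fusing the at least two processed components with the second side information corresponding to each component to obtain the fusion information of the pixel information to be filtered.
- The method according to claim 1, wherein the structure of the neural network comprises at least one joint processing stage and one independent processing stage; in the joint processing stage, all components are processed together; in the independent processing stage, each component is processed on an independent branch of the neural network.
- A filtering device, wherein the filtering device comprises: a first obtaining module configured to obtain pixel information to be filtered; a second obtaining module configured to obtain at least one type of side information; and a determining module configured to input at least two components of the pixel information to be filtered and the at least one type of side information into a neural network-based filter, to output at least one filtered component of the pixel information to be filtered.
- An encoder, wherein the encoder comprises: a processor and a storage medium storing instructions executable by the processor, the storage medium relying on the processor to perform operations through a communication bus; when the instructions are executed by the processor, the filtering method according to any one of claims 1 to 9 is executed.
- A computer storage medium storing executable instructions, wherein, when the executable instructions are executed by one or more processors, the processors execute the filtering method according to any one of claims 1 to 9.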
Priority Applications (5)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2021556289A JP2022526107A (ja) | 2019-03-24 | 2019-09-05 | Filtering method, device, encoder and computer storage medium |
| CN201980094255.XA CN113574884A (zh) | 2019-03-24 | 2019-09-05 | Filtering method, device, encoder and computer storage medium |
| KR1020217032825A KR102916992B1 (ko) | 2019-03-24 | 2019-09-05 | Filtering method, device, encoder and computer storage medium |
| EP19922221.7A EP3941057A4 (en) | 2019-03-24 | 2019-09-05 | FILTERING METHOD AND APPARATUS, ENCODER AND COMPUTER STORAGE MEDIUM |
| US17/475,184 US12206904B2 (en) | 2019-03-24 | 2021-09-14 | Filtering method and device, encoder and computer storage medium |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201962822951P | 2019-03-24 | 2019-03-24 | |
| US62/822,951 | 2019-03-24 |
Related Child Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US17/475,184 Continuation US12206904B2 (en) | 2019-03-24 | 2021-09-14 | Filtering method and device, encoder and computer storage medium |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2020192020A1 (zh) | 2020-10-01 |
Family
ID=72608393
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2019/104499 Ceased WO2020192020A1 (zh) | 2019-03-24 | 2019-09-05 | Filtering method, device, encoder and computer storage medium |
Country Status (6)
| Country | Link |
|---|---|
| US (1) | US12206904B2 (zh) |
| EP (1) | EP3941057A4 (zh) |
| JP (1) | JP2022526107A (zh) |
| KR (1) | KR102916992B1 (zh) |
| CN (1) | CN113574884A (zh) |
| WO (1) | WO2020192020A1 (zh) |
Cited By (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN115151941A (zh) * | 2020-12-29 | 2022-10-04 | 腾讯美国有限责任公司 | Method and apparatus for video coding |
| JP2023100261A (ja) * | 2022-01-05 | 2023-07-18 | 株式会社アクセル | Encoding device, decoding device, encoding method, decoding method, encoding program, and decoding program |
| WO2023190053A1 (ja) * | 2022-03-31 | 2023-10-05 | パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ | Image encoding device, image decoding device, image encoding method, and image decoding method |
| JP2023544705A (ja) * | 2020-10-05 | 2023-10-25 | クゥアルコム・インコーポレイテッド | Joint-component neural network based filtering during video coding |
| JP2023544711A (ja) * | 2020-10-06 | 2023-10-25 | インターデジタル ヴイシー ホールディングス フランス,エスエーエス | Spatial resolution adaptation of in-loop and post-filtering of compressed video using metadata |
Families Citing this family (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11902561B2 (en) * | 2020-04-18 | 2024-02-13 | Alibaba Group Holding Limited | Convolutional-neutral-network based filter for video coding |
| EP4150907A1 (en) * | 2020-06-10 | 2023-03-22 | Huawei Technologies Co., Ltd. | Adaptive image enhancement using inter-channel correlation information |
| US12603998B2 (en) * | 2021-07-07 | 2026-04-14 | Lemon Inc. | Configurable neural network model depth in neural network-based video coding |
| CN117793355A (zh) * | 2022-09-19 | 2024-03-29 | 腾讯科技(深圳)有限公司 | Multimedia data processing method, apparatus, device and storage medium |
| US20260082084A1 (en) * | 2023-06-26 | 2026-03-19 | Lg Electronics Inc. | Image encoding/decoding method, method of transmitting bitstream and recording medium storing bitstream |
| CN119996677A (zh) * | 2023-11-09 | 2025-05-13 | 腾讯科技(深圳)有限公司 | Filtering method, apparatus, electronic device and storage medium |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN108184129A (zh) * | 2017-12-11 | 2018-06-19 | 北京大学 | Video encoding and decoding method, apparatus, and neural network for image filtering |
| US10019814B2 (en) * | 2016-05-16 | 2018-07-10 | Canon Kabushiki Kaisha | Method, apparatus and system for determining a luma value |
| CN109120937A (zh) * | 2017-06-26 | 2019-01-01 | 杭州海康威视数字技术股份有限公司 | Video encoding method, decoding method, apparatus, and electronic device |
| CN109151475A (zh) * | 2017-06-27 | 2019-01-04 | 杭州海康威视数字技术股份有限公司 | Video encoding method, decoding method, apparatus, and electronic device |
Family Cites Families (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP7260472B2 (ja) * | 2017-08-10 | 2023-04-18 | シャープ株式会社 | Image filter device |
| EP3451670A1 (en) * | 2017-08-28 | 2019-03-06 | Thomson Licensing | Method and apparatus for filtering with mode-aware deep learning |
-
2019
- 2019-09-05 EP EP19922221.7A patent/EP3941057A4/en not_active Withdrawn
- 2019-09-05 KR KR1020217032825A patent/KR102916992B1/ko active Active
- 2019-09-05 CN CN201980094255.XA patent/CN113574884A/zh active Pending
- 2019-09-05 WO PCT/CN2019/104499 patent/WO2020192020A1/zh not_active Ceased
- 2019-09-05 JP JP2021556289A patent/JP2022526107A/ja active Pending
-
2021
- 2021-09-14 US US17/475,184 patent/US12206904B2/en active Active
Patent Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10019814B2 (en) * | 2016-05-16 | 2018-07-10 | Canon Kabushiki Kaisha | Method, apparatus and system for determining a luma value |
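| CN109120937A (zh) * | 2017-06-26 | 2019-01-01 | 杭州海康威视数字技术股份有限公司 | Video encoding method, decoding method, apparatus, and electronic device |
| CN109151475A (zh) * | 2017-06-27 | 2019-01-04 | 杭州海康威视数字技术股份有限公司 | Video encoding method, decoding method, apparatus, and electronic device |
| CN108184129A (zh) * | 2017-12-11 | 2018-06-19 | 北京大学 | Video encoding and decoding method, apparatus, and neural network for image filtering |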
Non-Patent Citations (1)
| Title |
|---|
| See also references of EP3941057A4 * |
Cited By (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2023544705A (ja) * | 2020-10-05 | 2023-10-25 | クゥアルコム・インコーポレイテッド | Joint-component neural network based filtering during video coding |
| JP2023544711A (ja) * | 2020-10-06 | 2023-10-25 | インターデジタル ヴイシー ホールディングス フランス,エスエーエス | Spatial resolution adaptation of in-loop and post-filtering of compressed video using metadata |
| US12549747B2 (en) | 2020-10-06 | 2026-02-10 | Interdigital Ce Patent Holdings, Sas | Spatial resolution adaptation of in-loop and post-filtering of compressed video using metadata |
| CN115151941A (zh) * | 2020-12-29 | 2022-10-04 | 腾讯美国有限责任公司 | Method and apparatus for video coding |
| CN115151941B (zh) * | 2020-12-29 | 2025-07-25 | 腾讯美国有限责任公司 | Video processing method, computer apparatus, device and storage medium |
| JP2023100261A (ja) * | 2022-01-05 | 2023-07-18 | 株式会社アクセル | Encoding device, decoding device, encoding method, decoding method, encoding program, and decoding program |
| JP7742041B2 (ja) | 2022-01-05 | 2025-09-19 | 株式会社アクセル | Encoding device, decoding device, encoding method, decoding method, encoding program, and decoding program |
| WO2023190053A1 (ja) * | 2022-03-31 | 2023-10-05 | パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ | Image encoding device, image decoding device, image encoding method, and image decoding method |
Also Published As
| Publication number | Publication date |
|---|---|
| KR20210139342A (ko) | 2021-11-22 |
| US20220021905A1 (en) | 2022-01-20 |
| EP3941057A1 (en) | 2022-01-19 |
| EP3941057A4 (en) | 2022-06-01 |
| CN113574884A (zh) | 2021-10-29 |
| US12206904B2 (en) | 2025-01-21 |
| JP2022526107A (ja) | 2022-05-23 |
| KR102916992B1 (ko) | 2026-01-22 |
Similar Documents
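| Publication | Title | Publication Date |
|---|---|---|
| WO2020192020A1 (zh) | Filtering method, device, encoder and computer storage medium | |
| US12177491B2 (en) | Loop filter implementation method and apparatus, and computer storage medium | |
| CN113747179B (zh) | Loop filter implementation method and apparatus, and computer storage medium | |
| WO2020192034A1 (zh) | Filtering method and apparatus, and computer storage medium | |
| CN113784128B (zh) | Image prediction method, encoder, decoder and storage medium | |
| CN113766233B (zh) | Image prediction method, encoder, decoder and storage medium | |
| WO2021203381A1 (zh) | Video encoding and decoding method and apparatus, and computer-readable storage medium | |
| WO2025129410A1 (zh) | Encoding and decoding method, bitstream, encoder, decoder and storage medium | |
| WO2025097423A1 (zh) | Encoding and decoding method, bitstream, encoder, decoder and storage medium | |
| WO2025138170A1 (zh) | Encoding and decoding method, codec and storage medium | |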
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 19922221; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2021556289; Country of ref document: JP; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | ENP | Entry into the national phase | Ref document number: 20217032825; Country of ref document: KR; Kind code of ref document: A |
| | ENP | Entry into the national phase | Ref document number: 2019922221; Country of ref document: EP; Effective date: 20211015 |
| | WWD | Wipo information: divisional of initial pct application | Ref document number: 202528020913; Country of ref document: IN |
| | WWP | Wipo information: published in national office | Ref document number: 202528020913; Country of ref document: IN |
| | WWW | Wipo information: withdrawn in national office | Ref document number: 2019922221; Country of ref document: EP |
