WO2024055458A1 - Image noise reduction processing method, device, equipment, storage medium and program product - Google Patents

Image noise reduction processing method, device, equipment, storage medium and program product

Info

Publication number
WO2024055458A1
Authority
WO
WIPO (PCT)
Prior art keywords
sampling
module
upsampling
feature data
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2022/138842
Other languages
English (en)
French (fr)
Inventor
潘叶峰
胡胜发
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Anyka Microelectronics Co Ltd
Original Assignee
Guangzhou Anyka Microelectronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Anyka Microelectronics Co Ltd
Priority to EP22958645.8A (patent EP4535279A4, en)
Priority to JP2024569570A (patent JP7826519B2, ja)
Priority to US18/992,375 (patent US20250390989A1, en)
Publication of WO2024055458A1 (patent WO2024055458A1, zh)
Legal status: Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4007 Scaling of whole images or parts thereof, e.g. expanding or contracting based on interpolation, e.g. bilinear interpolation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4046 Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/60 Image enhancement or restoration using machine learning, e.g. neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10024 Color image
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20016 Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Definitions

  • The present application relates to the field of image processing technology, and in particular to an image noise reduction processing method, device, equipment, storage medium and program product.
  • Image noise reduction is a key task in the field of image processing.
  • the ISP (Image Signal Processor) chip is mainly used for image processing of real-time video images captured by the terminal. In terms of image noise reduction processing, it requires high real-time performance of the image noise reduction algorithm.
  • this application provides an image noise reduction processing method.
  • the method includes:
  • the target image data includes the pixel values of each channel of the target image; wherein the image noise reduction model includes a cascaded downsampling model, an upsampling model and an output layer.
  • the downsampling model includes n cascaded downsampling modules, and the upsampling model includes n cascaded upsampling modules that correspond to n downsampling modules one-to-one;
  • the down-sampling module includes a first down-sampling module, a second down-sampling module and a fusion module cascaded with both the first down-sampling module and the second down-sampling module;
  • the first down-sampling module includes a cascaded first down-sampling layer and a first convolutional layer;
  • the second downsampling module includes a second downsampling layer.
  • inputting the target image data into the image denoising model to obtain the denoising image data output by the image denoising model includes: inputting the target image data into the down-sampling model, where each down-sampling module in the down-sampling model performs down-sampling processing on the target image data to obtain down-sampling feature data; inputting the down-sampling feature data into the up-sampling model, where each up-sampling module in the up-sampling model performs up-sampling processing on the down-sampling feature data to obtain up-sampling feature data; and obtaining, by the output layer, the noise-reduced image data based on the up-sampling feature data and the target image data.
  • when i is 1, the input data of the i-th down-sampling module is the target image data;
  • when i is greater than 1, the input data of the i-th down-sampling module is the intermediate down-sampling feature data output by the (i-1)-th down-sampling module;
  • the intermediate down-sampling feature data output by the last down-sampling module is used as the down-sampling feature data.
  • obtaining the noise reduction image data from the output layer based on the upsampling feature data and the target image data includes: inputting the upsampling feature data and the target image data to the output layer for fusion processing to obtain the noise-reduced image data output by the output layer.
  • the image data resolution of each channel of the target image is different
  • the down-sampling model also includes an additional down-sampling module; inputting the target image data into the down-sampling model and performing down-sampling processing on the target image data by each down-sampling module in the down-sampling model to obtain down-sampled feature data includes:
  • the input data of the module is the intermediate down-sampling feature data output by the (i-1)-th down-sampling module; the intermediate down-sampling feature data output by the last down-sampling module is used as the down-sampling feature data.
  • the upsampling model further includes an additional upsampling module; inputting the downsampled feature data into the upsampling model and performing upsampling processing on the downsampled feature data by each upsampling module in the upsampling model to obtain upsampled feature data includes:
  • when i is greater than 1, the input data of the i-th upsampling module is the intermediate up-sampling feature output by the (i-1)-th upsampling module.
  • the first intermediate channel feature data corresponding to the channel pixel value is input into the additional upsampling module, and the upsampling feature data output by the additional upsampling module is obtained.
  • the noise reduction image data is obtained by the output layer based on the upsampling feature data and the target image data, including:
  • the upsampling feature data and the first channel pixel value in the target image data are input to the output layer for fusion processing to obtain candidate noise reduction image data output by the output layer; the noise reduction image data is obtained according to the candidate noise reduction image data and the second intermediate channel feature data, corresponding to the second channel pixel value, included in the intermediate upsampling feature data output by the last upsampling module.
  • the input data of the i-th down-sampling module is down-sampled to obtain the intermediate down-sampling feature data output by the i-th down-sampling module, including:
  • the first down-sampling layer is used to perform down-sampling processing on the input data of the i-th down-sampling module to obtain the first down-sampling feature data output by the first down-sampling layer; the first convolution layer is used to perform convolution processing on the first down-sampling feature data to obtain the first convolution feature data output by the first convolution layer; the second down-sampling layer is used to perform down-sampling processing on the input data of the i-th down-sampling module to obtain the second down-sampling feature data output by the second down-sampling layer; the fusion module is used to fuse the first convolution feature data and the second down-sampling feature data to obtain the intermediate down-sampling feature data output by the fusion module.
  • the upsampling module includes a cascaded second convolution layer and an upsampling layer; performing upsampling processing on the input data of the i-th upsampling module to obtain the intermediate upsampled feature data output by the i-th upsampling module includes:
  • the feature data is subjected to upsampling processing to obtain the intermediate upsampling feature data output by the upsampling layer.
  • the image noise reduction model is used in the RAW image noise reduction module, RGB image noise reduction module or YUV image noise reduction module in the ISP chip; correspondingly, the format of the target image is RAW format, RGB format or YUV format.
  • the upsampling layer performs upsampling processing on the input data of the upsampling layer through convolution processing, unpooling processing or interpolation processing.
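The interpolation and unpooling options named above can be illustrated with a minimal sketch. The function names `upsample_nearest` and `upsample_unpool`, the fixed factor of 2 and the 2-D list representation are assumptions made for illustration; the source does not state which interpolation kernel or unpooling variant is used. The unpooling shown here is a simplified form that writes each value to the top-left corner of its block and fills the rest with zeros.

```python
def upsample_nearest(feature, scale=2):
    """Nearest-neighbour interpolation: repeat each value `scale` times on both axes."""
    out = []
    for row in feature:
        wide = [v for v in row for _ in range(scale)]   # stretch the row horizontally
        out.extend([list(wide) for _ in range(scale)])  # repeat the row vertically
    return out


def upsample_unpool(feature, scale=2):
    """Simplified unpooling (assumption): value at the top-left of each block, zeros elsewhere."""
    h, w = len(feature), len(feature[0])
    out = [[0] * (w * scale) for _ in range(h * scale)]
    for y in range(h):
        for x in range(w):
            out[y * scale][x * scale] = feature[y][x]
    return out


small = [[1, 2], [3, 4]]
nearest = upsample_nearest(small)
unpooled = upsample_unpool(small)
```

Both functions double the height and width of the feature map; a convolution-based upsampling layer would instead learn the values that fill the enlarged grid.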
  • this application also provides an image noise reduction processing device.
  • the device includes:
  • the denoising module is used to input the target image data into the image denoising model and obtain the denoising image data output by the image denoising model.
  • the target image data includes the pixel values of each channel of the target image;
  • the image denoising model includes a cascaded downsampling model, an upsampling model and an output layer.
  • the downsampling model includes n cascaded downsampling modules
  • the upsampling model includes n cascaded upsampling modules in one-to-one correspondence with the n downsampling modules.
  • the first downsampling module includes a cascaded first downsampling layer and a first convolutional layer, and the second downsampling module includes a second downsampling layer.
  • the noise reduction module is specifically used for:
  • the target image data is input into the down-sampling model, and each down-sampling module in the down-sampling model performs down-sampling processing on the target image data to obtain down-sampled feature data; the down-sampled feature data is input into the up-sampling model.
  • each upsampling module in the upsampling model performs upsampling processing on the downsampled feature data to obtain upsampled feature data;
  • the output layer obtains the noise-reduced image data based on the upsampled feature data and the target image data.
  • the image data resolution of each channel of the target image is the same, and the noise reduction module is specifically used to:
  • the input data of the i-th down-sampling module is the intermediate down-sampling feature data output by the (i-1)-th down-sampling module; the intermediate down-sampling feature data output by the last down-sampling module is used as the down-sampling feature data.
  • the noise reduction module is specifically used for:
  • the input data of the i-th upsampling module is the intermediate upsampling feature output by the (i-1)-th upsampling module.
  • the noise reduction module is specifically used for:
  • the upsampled feature data and the target image data are input to the output layer for fusion processing, and the noise-reduced image data output by the output layer is obtained.
  • the image data resolution of each channel of the target image is different.
  • the downsampling model also includes an additional downsampling module.
  • the noise reduction module is specifically used for:
  • the input data of the module is the intermediate down-sampling feature data output by the (i-1)-th down-sampling module; the intermediate down-sampling feature data output by the last down-sampling module is used as the down-sampling feature data.
  • the upsampling model also includes an additional upsampling module, and the noise reduction module is specifically used for:
  • the input data of the i-th upsampling module is the intermediate upsampling feature output by the (i-1)-th upsampling module.
  • the first intermediate channel feature data corresponding to the channel pixel value is input into the additional upsampling module, and the upsampling feature data output by the additional upsampling module is obtained.
  • the noise reduction module is specifically used for:
  • the upsampling feature data and the first channel pixel value in the target image data are input to the output layer for fusion processing to obtain candidate noise reduction image data output by the output layer; the noise reduction image data is obtained according to the candidate noise reduction image data and the second intermediate channel feature data, corresponding to the second channel pixel value, included in the intermediate upsampling feature data output by the last upsampling module.
  • the noise reduction module is specifically used for:
  • the first down-sampling layer is used to perform down-sampling processing on the input data of the i-th down-sampling module to obtain the first down-sampling feature data output by the first down-sampling layer; the first convolution layer is used to perform convolution processing on the first down-sampling feature data to obtain the first convolution feature data output by the first convolution layer; the second down-sampling layer is used to perform down-sampling processing on the input data of the i-th down-sampling module to obtain the second down-sampling feature data output by the second down-sampling layer; the fusion module is used to fuse the first convolution feature data and the second down-sampling feature data to obtain the intermediate down-sampling feature data output by the fusion module.
  • the upsampling module includes a cascaded second convolution layer and an upsampling layer; the noise reduction module is specifically used for:
  • the feature data is subjected to upsampling processing to obtain the intermediate upsampling feature data output by the upsampling layer.
  • the image noise reduction model is used in the RAW image noise reduction module, RGB image noise reduction module or YUV image noise reduction module in the ISP chip; correspondingly, the format of the target image is RAW format, RGB format or YUV format.
  • the upsampling layer performs upsampling processing on the input data of the upsampling layer through convolution processing, unpooling processing or interpolation processing.
  • the present application also provides an electronic device, including a memory and a processor.
  • the memory stores a computer program.
  • when the processor executes the computer program, it implements the steps of the method described in any one of the above first aspects.
  • the present application also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the method described in any one of the above first aspects are implemented.
  • the present application also provides a computer program product, which includes a computer program that, when executed by a processor, implements the steps of the method described in any one of the above first aspects.
  • With the above image noise reduction processing method, device, equipment, storage medium and program product, the target image data including the pixel values of each channel of the target image can be input directly into the image noise reduction model, and the denoising image data output by the image noise reduction model can be obtained, thereby denoising the target image.
  • In the traditional technology, noise reduction has to be performed separately on the Y-channel pixel values and the UV-channel pixel values of a YUV image to obtain the noise-reduced image data. Because the Y channel and the UV channels are handled in separate passes, the image data has to be fetched repeatedly, so processing efficiency is poor and the real-time requirements of the ISP chip cannot be met.
  • In the present application, the target image data including the pixel values of each channel of the target image can be input directly into the image noise reduction model for noise reduction processing, that is, the data of every channel of the target image is denoised at the same time. The amount of data and calculation is greatly reduced, which effectively improves data processing efficiency and meets the real-time requirements; in addition, during noise reduction the information in the different channels of the target image can be cross-referenced to achieve a better noise reduction effect.
  • The network structure of the image denoising model is simplified: it includes a cascaded down-sampling model, an up-sampling model and an output layer. The down-sampling model includes n cascaded down-sampling modules, and the up-sampling model includes n cascaded up-sampling modules in one-to-one correspondence with the n down-sampling modules. The down-sampling module includes a first down-sampling module, a second down-sampling module, and a fusion module cascaded with both the first down-sampling module and the second down-sampling module; the first down-sampling module includes a cascaded first down-sampling layer and a first convolution layer, and the second down-sampling module includes a second down-sampling layer. This simplified image denoising model achieves effective noise reduction of the target image data, which makes it fully suited to denoising real-time video images on the ISP chip.
  • Figure 1 is a schematic structural diagram of an image denoising model in one embodiment
  • Figure 2 is a schematic structural diagram of a multi-convolution parallel module in an embodiment
  • Figure 3 is a schematic diagram of the image processing flow of a traditional ISP chip in one embodiment
  • Figure 4 is a schematic diagram of the image processing flow of the first improved ISP chip in one embodiment
  • Figure 5 is a schematic diagram of the image processing flow of the second improved ISP chip in one embodiment
  • Figure 6 is a schematic flowchart of noise reduction processing in an embodiment
  • Figure 7 is a schematic structural diagram of a noise reduction neural network in one embodiment
  • Figure 8 is a schematic structural diagram of another image noise reduction model in one embodiment
  • Figure 9 is a schematic structural diagram of another noise reduction neural network in one embodiment.
  • Figure 10 is a schematic diagram of element-wise fusion processing in one embodiment
  • Figure 11 is a schematic diagram of the fusion process of channel splicing in one embodiment
  • Figure 12 is a structural block diagram of an image noise reduction processing device in one embodiment
  • Figure 13 is an internal structure diagram of a computer device in one embodiment.
  • Image denoising is a type of image restoration technology, which aims to accurately find the signal value or noise value in an image, or to separate the signal part from the noise part in an image.
  • Image noise reduction algorithms are currently mainly divided into traditional algorithms and algorithms based on neural networks.
  • the current basic situation is that the noise reduction effect of traditional algorithms is poor and cannot meet the needs of noise reduction effects; while the neural network algorithm has a very large calculation amount and is not friendly to existing chips and cannot meet the real-time needs of ISP chips.
  • traditional image noise reduction can be divided into spatial domain noise reduction, frequency domain noise reduction, and spatial-frequency domain combined noise reduction according to the characteristic space of separated signal noise; according to the image range used in the noise reduction process, it can be divided into local noise reduction and non-local noise reduction.
  • Specific traditional noise reduction methods include mean filtering, median filtering, Gaussian filtering, bilateral filtering, non-local mean filtering, guided filtering, discrete cosine domain filtering, wavelet transform domain filtering, etc.
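Two of the traditional spatial-domain filters listed above, mean filtering and median filtering, can be compared with a minimal sketch. The helper name `filter3x3`, the 3x3 window and the choice to copy border pixels unchanged are assumptions for illustration and are not taken from the source.

```python
from statistics import median


def mean(values):
    """Arithmetic mean of a window."""
    return sum(values) / len(values)


def filter3x3(image, reduce_fn):
    """Apply `reduce_fn` to every 3x3 neighbourhood; border pixels are copied unchanged."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [image[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = reduce_fn(window)
    return out


# A flat patch with one impulse-noise pixel in the centre.
noisy = [[10, 10, 10],
         [10, 255, 10],
         [10, 10, 10]]
median_out = filter3x3(noisy, median)  # the outlier is replaced by 10
mean_out = filter3x3(noisy, mean)      # the outlier is only averaged down to 335 / 9
```

The fixed window and fixed statistic are what the text means by a fixed set of methods: the filter behaves the same way regardless of image content.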
  • Traditional noise reduction methods are based on the simple assumption of statistical differences in signal and noise characteristics, and use a fixed set of methods to separate signals and noise.
  • embodiments of the present application provide an image denoising processing method that satisfies the real-time requirements of the ISP chip and can better denoise video images.
  • an image noise reduction processing method is provided.
  • This embodiment of the present application illustrates the method as applied to a terminal that includes an ISP chip. The method can also be applied to a server, or to a system that includes a terminal and a server, where it is implemented through interaction between the terminal and the server. Specifically, the method may be executed by the ISP chip in the terminal.
  • the terminal can be, but is not limited to, various computer equipment or photography equipment, etc.
  • the server can be implemented as an independent server or a server cluster composed of multiple servers.
  • the method includes: inputting target image data into the image denoising model to obtain denoising image data output by the image denoising model, where the target image data includes pixel values of each channel of the target image.
  • the image denoising model includes a cascaded downsampling model, an upsampling model and an output layer. The downsampling model includes n cascaded down-sampling modules, and the up-sampling model includes n cascaded up-sampling modules that correspond to the n down-sampling modules one-to-one; the down-sampling module includes a first down-sampling module, a second down-sampling module, and a fusion module cascaded with both the first downsampling module and the second downsampling module; the first downsampling module includes a cascaded first downsampling layer and a first convolutional layer, and the second downsampling module includes a second downsampling layer.
  • the ISP (Image Signal Processor) chip is used to obtain the image captured by the image sensor at the front end of the terminal, perform a series of image processing and output the processed image.
  • the steps by which the ISP chip processes RAW-format images from the image sensor include: dead pixel correction, dark current correction, lens shading correction, RAW image noise reduction, white balance, color interpolation and so on, which yield the RGB image; Gamma correction, color correction, RGB-to-YUV conversion and so on, which yield the YUV image; and noise reduction, edge enhancement, brightness/contrast/hue/saturation adjustment and so on applied to the YUV image, after which the image data is encoded to obtain the final output video image.
  • the image processed by the ISP chip can be a single image or a video image composed of consecutive frames.
  • Various image processing algorithms can be integrated into the ISP chip to implement the above image processing steps of the ISP chip.
  • the image noise reduction model is an algorithm applied in the ISP chip to implement the image noise reduction processing steps.
  • FIG. 1 only uses three down-sampling modules and three up-sampling modules as an example, and is not used to limit this application.
  • the target image is an image that needs to be processed for noise reduction in the ISP chip, and the target image data is the pixel value of each channel of the target image.
  • when a single image is acquired, the target image is that single image; when a real-time video image is acquired, the target image is a single frame of the real-time video image.
  • the target image may be in RAW format, RGB format, YUV format, etc., which is not specifically limited in the embodiments of the present application.
  • the image noise reduction model can process the target image differently depending on whether the image data of its channels has the same resolution. Specifically, for target images whose channels have the same image data resolution, the image denoising model has a single input and a single output and can perform denoising directly on the input target image data; for target images whose channels have different image data resolutions, the image denoising model has multiple inputs and outputs, and the image data of the different channels in the target image data can be input into the image denoising model for noise reduction processing. As a result, the image noise reduction model can be adapted to perform noise reduction on various types of target images.
  • image denoising is divided into single image denoising and multi-frame image joint denoising.
  • single-image denoising is more suitable for real-time processing scenarios than joint denoising of multi-frame images.
  • the embodiment of this application builds on the U-Net network, optimizes the network structure and the computing units, and proposes a very streamlined single-image noise reduction network structure, that is, the image denoising model.
  • the main structure of the network of the image denoising model is the U-Net network.
  • the downsampling model of the image denoising model is used to downsample the target image data, the upsampling model is used to upsample the feature data obtained after downsampling, and the output layer is used to output the noise reduction image data corresponding to the target image based on the output data of the upsampling model and the target image data.
  • the input of each upsampling module is the output data of the previous upsampling module and the output data of the downsampling module corresponding to the upsampling module.
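The data flow just described (n down-sampling modules, n up-sampling modules fed by the previous up-sampling output plus the matching down-sampling output, then an output layer that uses the target image data) can be traced with a minimal sketch. Everything here is a stand-in chosen for illustration: the signal is 1-D, `down` averages pairs, `up` repeats values, fusion is element-wise addition, and the pairing of skip connections by equal resolution follows the usual U-Net layout, which the source implies but does not spell out.

```python
def down(x):
    """Stand-in down-sampling module: average adjacent pairs (halves the length)."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]


def up(x):
    """Stand-in up-sampling module: repeat each value (doubles the length)."""
    return [v for v in x for _ in range(2)]


def add(a, b):
    """Element-wise fusion of two equally sized feature vectors."""
    assert len(a) == len(b), "fused features must have the same resolution"
    return [p + q for p, q in zip(a, b)]


def denoise_flow(target, n=3):
    """Trace the encoder/decoder data flow; len(target) must be divisible by 2**n."""
    skips, feat = [], target
    for _ in range(n):                 # cascaded down-sampling modules
        feat = down(feat)
        skips.append(feat)
    feat = up(skips[-1])               # first up-sampling module: down-sampled features only
    for i in range(2, n + 1):          # later modules also take the matching skip
        feat = up(add(feat, skips[n - i]))
    return add(feat, target)           # output layer fuses with the target image data


out = denoise_flow([1.0] * 8, n=3)
```

With real learned layers the arithmetic differs, but the resolutions at each stage and the wiring of the skip connections follow the same pattern.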
  • the down-sampling model consists of n cascaded down-sampling modules, and each down-sampling module is a multi-convolution parallel module used to extract more image features.
  • Figure 2 shows a schematic structural diagram of a multi-convolution parallel module provided by an embodiment of the present application.
  • the multi-convolution parallel module 200 includes two downsampling layers 201, a convolution layer 202 and a fusion layer 203.
  • the multi-convolution parallel module 200 may also include other numbers of down-sampling layers and convolution layers, which are not specifically limited in the embodiments of this application.
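A minimal sketch of one multi-convolution parallel module follows: one branch applies a down-sampling layer and then a convolution layer, the other branch applies only a down-sampling layer, and a fusion layer combines the two. The concrete choices are assumptions for illustration only: 2x2 average pooling in the first branch, 2x2 max pooling in the second, a zero-padded 3x3 kernel, and element-wise addition as the fusion. The source does not specify the pooling types or kernel sizes.

```python
def pool2x2(x, reduce_fn):
    """Down-sampling layer: reduce each non-overlapping 2x2 block to one value."""
    h, w = len(x), len(x[0])
    return [[reduce_fn([x[y][c], x[y][c + 1], x[y + 1][c], x[y + 1][c + 1]])
             for c in range(0, w, 2)] for y in range(0, h, 2)]


def conv3x3(x, kernel):
    """Convolution layer: 3x3 sliding-window weighted sum with zero padding."""
    h, w = len(x), len(x[0])

    def px(y, c):
        return x[y][c] if 0 <= y < h and 0 <= c < w else 0.0

    return [[sum(kernel[i][j] * px(y + i - 1, c + j - 1)
                 for i in range(3) for j in range(3))
             for c in range(w)] for y in range(h)]


def avg(values):
    return sum(values) / len(values)


def parallel_downsample(x, kernel):
    """Fuse (down-sample then convolve) with (down-sample only) by element-wise addition."""
    branch1 = conv3x3(pool2x2(x, avg), kernel)
    branch2 = pool2x2(x, max)
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(branch1, branch2)]


identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
x = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
fused = parallel_downsample(x, identity)
```

Placing the convolution after the down-sampling layer means it runs on a quarter of the pixels, which is in line with the stated goal of keeping the computation small enough for an ISP chip.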
  • the down-sampling module includes the first volume
  • other numbers of convolution layers and down-sampling layers may also be included. The details may be determined based on parameters such as the computing power and bandwidth of the ISP chip itself, which is not specifically limited in the embodiments of the present application.
  • the image denoising model includes the downsampling module of the structure provided in the embodiment of the present application to achieve a better downsampling effect; while ensuring the denoising effect, it can also meet the real-time requirements of the ISP chip.
  • the number of down-sampling modules and up-sampling modules in the image denoising model can be determined based on parameters such as the computing power and bandwidth of the ISP chip itself, which is not specifically limited in the embodiments of this application.
  • the channel fusion method of the fusion module in the downsampling module can be element-wise addition or channel splicing, which is not specifically limited in the embodiments of this application.
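The two fusion options differ mainly in the shape of the result, which a minimal sketch makes concrete. Feature data is represented here as a list of channels, each channel a 2-D list; the function names `fuse_add` and `fuse_concat` are hypothetical.

```python
def fuse_add(a, b):
    """Element-wise addition: inputs need identical shapes; channel count is unchanged."""
    assert len(a) == len(b), "element-wise addition needs the same number of channels"
    return [[[p + q for p, q in zip(row_a, row_b)]
             for row_a, row_b in zip(ch_a, ch_b)]
            for ch_a, ch_b in zip(a, b)]


def fuse_concat(a, b):
    """Channel splicing: stack the channels of b after those of a; channel count is summed."""
    return a + b


conv_features = [[[1, 2], [3, 4]]]        # 1 channel, 2x2
pooled_features = [[[10, 20], [30, 40]]]  # 1 channel, 2x2
added = fuse_add(conv_features, pooled_features)
spliced = fuse_concat(conv_features, pooled_features)
```

Addition keeps the channel count, and therefore the cost of the following layer, constant. Splicing preserves both inputs unchanged but widens the input of the next layer.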
  • the above image denoising processing method can directly input the target image data including the pixel values of each channel of the target image into the image denoising model, and then obtain the denoising image data output by the image denoising model, thereby realizing the target image noise reduction processing.
  • In the traditional technology, noise reduction has to be performed separately on the Y-channel pixel values and the UV-channel pixel values of a YUV image to obtain the noise-reduced image data. Because the Y channel and the UV channels are handled in separate passes, the image data has to be fetched repeatedly, so processing efficiency is poor and the real-time requirements of the ISP chip cannot be met.
  • In the present application, the target image data including the pixel values of each channel of the target image can be input directly into the image noise reduction model for noise reduction processing, that is, the data of every channel of the target image is denoised at the same time. The amount of data and calculation is greatly reduced, which effectively improves data processing efficiency and meets the real-time requirements; in addition, during noise reduction the information in the different channels of the target image can be cross-referenced to achieve a better noise reduction effect.
  • The network structure of the image denoising model is simplified: it includes a cascaded down-sampling model, an up-sampling model and an output layer. The down-sampling model includes n cascaded down-sampling modules, and the up-sampling model includes n cascaded up-sampling modules in one-to-one correspondence with the n down-sampling modules. The down-sampling module includes a first down-sampling module, a second down-sampling module, and a fusion module cascaded with both the first down-sampling module and the second down-sampling module; the first down-sampling module includes a first convolution layer and a first down-sampling layer, and the second down-sampling module includes a second down-sampling layer. This simplified image denoising model achieves effective noise reduction of the target image data, which makes it fully suited to denoising real-time video images on the ISP chip.
  • the image noise reduction model is used in the RAW image noise reduction module, RGB image noise reduction module or YUV image noise reduction module in the ISP chip; correspondingly, the format of the target image is RAW format, RGB format or YUV Format.
  • FIG. 3 shows a schematic diagram of the image processing flow of a traditional ISP chip provided by an embodiment of the present application.
  • the processing flow of the traditional ISP chip includes: obtaining the RAW image corresponding to each frame of the video transmitted by the front-end image sensor; applying dead pixel correction, dark current correction, lens shading correction, RAW image noise reduction, white balance, color interpolation, etc., to obtain the RGB image; and then applying Gamma correction, color correction, RGB-to-YUV conversion, etc., to obtain the YUV image.
  • the Y (brightness) channel data of the YUV image is subjected to noise reduction processing, followed by edge enhancement and brightness/contrast adjustment.
  • the UV (color) channel data of the YUV image is subjected to noise reduction processing and hue/saturation adjustment, which completes the noise reduction processing of the YUV image; the denoised YUV image is then digitally encoded to obtain the final output video image.
  • RAW is the format in which the image sensor acquires images; it is essentially a special RGB format.
  • the RAW image undergoes a series of processing and then undergoes color interpolation to obtain an ordinary RGB image, which is then converted to a YUV image after a series of processing.
  • the YUV image format is a format that separates the brightness and color of the image, where the brightness is represented by the Y channel and the color is represented by the UV channels.
  • the Y channel and the UV channels are processed separately.
  • a noise reduction algorithm is applied to the Y (brightness) channel and to the UV (color) channels separately; edge enhancement and brightness/contrast adjustment are then performed on the brightness component, and hue and saturation adjustment on the color components.
  • the image noise reduction model is applied to the ISP chip, and can be directly used to perform noise reduction processing on RGB images, RAW images, or YUV images.
  • correspondingly, the format of the target image is RAW format when the model is used in the RAW image noise reduction module, RGB format when it is used in the RGB image noise reduction module, and YUV format when it is used in the YUV image noise reduction module.
  • Figure 4 shows a schematic diagram of the first improved ISP chip image processing flow provided by the embodiment of the present application.
  • in this flow, the image noise reduction model is applied to the YUV image noise reduction module in the ISP chip to directly denoise the data of every channel of the YUV image.
  • the target image data input to the image denoising model is the pixel value of each channel of the YUV image.
  • FIG. 5 shows a schematic diagram of the second improved ISP chip image processing flow provided by the embodiment of the present application.
  • in this flow, the image noise reduction model is applied to the RGB image noise reduction module in the ISP chip.
  • the data of every channel of the RGB image is denoised, and the ISP chip can then directly convert the denoised RGB image into a YUV image and perform subsequent processing, without denoising the YUV image channel by channel again.
  • the target image data input to the image noise reduction model is the pixel value of each channel of the RGB image.
  • the image noise reduction model can also be applied to the RAW image noise reduction module in the ISP chip to directly perform noise reduction processing on RAW images.
  • the target image data input to the image noise reduction model is the pixel value of each channel of the RAW image.
  • the noise reduction module in the traditional ISP chip is slightly adjusted: brightness noise reduction and color noise reduction are merged, and the image noise reduction model performs the noise reduction as a whole, which improves noise extraction during the noise reduction process.
  • for YUV images, when all channels (Y, U, V) of the image are denoised together, the information of the other channels can be referenced to achieve a better noise reduction effect; in addition, the actual amount of computation and of data reads and writes for merged-channel processing is less than the sum for processing each channel separately. Merging the image channels therefore benefits both performance and effect, and applying the image noise reduction model to the ISP chip meets the chip's requirements in terms of both noise reduction effect and real-time performance.
  • FIG. 6 shows a schematic flow chart of a noise reduction process provided by an embodiment of the present application; inputting the target image data into the image noise reduction model to obtain the denoised image data output by the image noise reduction model includes:
  • Step 601 Input the target image data into the down-sampling model, and perform down-sampling processing on the target image data using each down-sampling module in the down-sampling model to obtain down-sampling feature data.
  • Step 602 Input the down-sampled feature data into the up-sampling model, and each up-sampling module in the up-sampling model performs up-sampling processing on the down-sampled feature data to obtain the up-sampled feature data.
  • Step 603 The output layer obtains noise-reduced image data based on the upsampled feature data and target image data.
  • the down-sampling model consists of n cascaded down-sampling modules.
  • the value of n can be determined based on the actual computing and storage conditions of the chip, thereby determining the structure of the down-sampling model and the up-sampling model.
  • the embodiments of this application do not specifically limit the number of up-sampling modules and down-sampling modules.
  • the number of up-sampling modules should be consistent with the number of down-sampling modules.
  • Each downsampling module is used to downsample its input data to extract more image features. After the down-sampling process, the image size is reduced.
  • the up-sampling module is used to perform up-sampling processing to restore the image size.
  • the last module among n cascaded down-sampling modules outputs downsampled feature data
  • the last module among n cascaded up-sampling modules outputs upsampled feature data
  • the upsampling feature data and the target image data are input to the output layer for fusion processing, and the noise reduction image data can be obtained based on the data output by the output layer.
  • the application simplifies the network structure and constructs a lightweight neural network model suitable for running on a chip to implement the image denoising model, so that a single-image denoising algorithm can be applied to real-time denoising on ISP chips while meeting the requirements for both denoising effect and real-time performance, which solves the problem of deploying a neural network on the chip and running it in real time.
  • the noise reduction effect has been significantly improved.
  • depending on the format of the target image, the process of inputting the target image data into the image denoising model for processing also differs.
  • the following describes, in turn, the processing of the target image data for the two types of target image.
  • the target image is in a format in which every channel has the same image data resolution, such as RAW format, RGB format, or YUV444 format.
  • inputting the target image data into the down-sampling model, and down-sampling the target image data by each down-sampling module in the down-sampling model to obtain down-sampled feature data, includes:
  • for the i-th down-sampling module, performing down-sampling processing on the input data of the i-th down-sampling module to obtain the intermediate down-sampling feature data output by the i-th down-sampling module; and using the intermediate down-sampling feature data output by the last down-sampling module as the down-sampled feature data.
  • when i equals 1, the input data of the i-th down-sampling module is the target image data; when i is greater than 1, the input data of the i-th down-sampling module is the intermediate down-sampling feature data output by the (i-1)-th down-sampling module.
  • images are stored in different data formats (such as RAW, RGB, YUV444, YUV420) in different modules. If the input and output of the image noise reduction model are in a format in which every channel has the same resolution, such as RAW, RGB or YUV444, the entire target image data is used as the input of the first downsampling module, and every other downsampling module takes the intermediate downsampling feature data output by the previous downsampling module as its input data. Each downsampling module is used to extract image features to obtain intermediate downsampling feature data.
  • inputting the down-sampled feature data into the up-sampling model, and up-sampling the down-sampled feature data by each up-sampling module in the up-sampling model to obtain the up-sampled feature data, includes:
  • for the i-th up-sampling module, performing up-sampling processing on the input data of the i-th up-sampling module to obtain the intermediate up-sampling feature data output by the i-th up-sampling module; and using the intermediate up-sampling feature data output by the last up-sampling module as the up-sampled feature data.
  • when i equals 1, the input data of the i-th up-sampling module is the down-sampled feature data; when i is greater than 1, the input data of the i-th up-sampling module is the aggregated feature data obtained by fusing the intermediate up-sampling feature data output by the (i-1)-th up-sampling module with the intermediate down-sampling feature data output by the down-sampling module corresponding to the i-th up-sampling module.
  • that is, the down-sampled feature data is used as the input of the first up-sampling module.
  • for every other up-sampling module, the intermediate up-sampling feature data output by the previous up-sampling module and the intermediate down-sampling feature data output by the down-sampling module corresponding to that up-sampling module are fused to obtain aggregated feature data, and the aggregated feature data is used as the input data of that up-sampling module; in this way, the deep features, shallow features, and features of different resolutions in the target image data can be fully integrated to improve the noise reduction effect.
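  • the wiring of the down-sampling and up-sampling paths described above can be sketched as follows. This is a minimal illustration, not the model of the embodiments: `down` and `up` are parameter-free stand-ins (2x2 average pooling and nearest-neighbour enlargement) for the learned modules, element-wise addition is used as the fusion, and the pairing of each up-sampling module with the down-sampling module of matching resolution is an assumption.

```python
import numpy as np

def down(x):
    """Stand-in down-sampling module: 2x2 average pooling on an (H, W, C) array."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def up(x):
    """Stand-in up-sampling module: nearest-neighbour 2x enlargement."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def forward(target, n=3):
    # Down-sampling path: module 1 takes the target image data,
    # module i > 1 takes the output of module i-1.
    down_feats = []
    x = target
    for _ in range(n):
        x = down(x)
        down_feats.append(x)

    # Up-sampling path: module 1 takes the down-sampled feature data
    # (the output of the last down-sampling module).
    y = up(down_feats[-1])
    for i in range(2, n + 1):
        # Assumed pairing: the down-sampling output with the same resolution
        # as the output of up-sampling module i-1.
        skip = down_feats[n - i]
        aggregated = y + skip      # element-wise addition as the fusion
        y = up(aggregated)         # input of up-sampling module i
    return y                       # up-sampled feature data


features = forward(np.ones((8, 8, 3)), n=3)
print(features.shape)  # same height and width as the target image data
```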
  • the output layer obtaining the noise reduction image data based on the upsampling feature data and the target image data includes: inputting the upsampling feature data and the target image data to the output layer for fusion processing to obtain the noise reduction image data output by the output layer.
  • the output layer is mainly used for fusion processing of input data.
  • both the output layer and the fusion module in the downsampling module are used for feature fusion processing.
  • the output layer can perform simple element-by-element addition processing or channel-by-channel superposition processing on the input data to achieve fusion processing.
  • the upsampling feature data output by the last upsampling module contains the noise feature data of the target image.
  • the target image data and the upsampling feature data are fused to remove the noise features from the target image data, so the output of the output layer is the denoised image data with the noise feature data removed, which achieves denoising of the target image. That is, the image denoising model actually predicts the noise residual, and after the output layer superimposes the noise residual on the target image, the denoised image output by the output layer is obtained.
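  • the residual formulation can be illustrated with a few lines of NumPy. The sketch assumes element-wise addition in the output layer and a sign convention in which the predicted residual is the negative of the noise; the function name `output_layer` and the synthetic data are illustrative only.

```python
import numpy as np

def output_layer(target, residual):
    """Superimpose the predicted noise residual on the target image (element-wise)."""
    return target + residual

rng = np.random.default_rng(0)
clean = np.full((4, 4, 3), 0.5)               # hypothetical noise-free image
noise = rng.normal(0.0, 0.05, clean.shape)    # synthetic noise
target = clean + noise                        # noisy target image data

# With an ideal residual (the negated noise), the output layer recovers the clean image.
denoised = output_layer(target, -noise)
print(np.allclose(denoised, clean))
```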
  • the upsampling module may be composed of cascaded convolutional layers and upsampling layers.
  • FIG. 7 shows a schematic structural diagram of a noise reduction neural network provided by an embodiment of the present application.
  • the image denoising model shown in Figure 7 includes three down-sampling modules and three up-sampling modules.
  • the fusion processing in the output layer, in the fusion modules of the down-sampling modules, and elsewhere in the model is all performed element-wise.
  • the downsampling layer and the upsampling layer can achieve downsampling or upsampling through convolution processing.
  • the image denoising model based on the U-Net network structure provided by the embodiment of the present application has a simple structure and denoises the data of all channels of the image as a whole, ensuring both a good denoising effect and high processing efficiency.
  • the number of down-sampling modules and up-sampling modules can be modified according to actual computing power and bandwidth limitations, which can appropriately improve the noise reduction effect.
  • three times of down-sampling and three times of up-sampling are taken as an example. In fact, the number of times can be increased (such as 4 times, 5 times, etc.) or reduced (such as 2 times).
  • the channel fusion method can be element-wise addition or channel splicing.
  • the target image is in a format in which the channels have different image data resolutions, such as YUV420 format.
  • in this case, the downsampling model in the image denoising model also includes an additional downsampling module.
  • the upsampling model in this image denoising model also includes an additional upsampling module. Please refer to FIG. 8, which shows a schematic structural diagram of another image noise reduction model provided by an embodiment of the present application.
  • the target image data is input into the downsampling model, and each downsampling module in the downsampling model downsamples the target image data to obtain downsampled feature data, including:
  • the first channel pixel value of the target image contained in the target image data is input to the additional downsampling module to obtain channel feature data output by the additional downsampling module.
  • the channel feature data is fused with the second channel pixel value of the target image contained in the target image data, and the candidate target image data is obtained as input data of the first downsampling module.
  • the input data of the i-th down-sampling module is down-sampled to obtain the intermediate down-sampling feature data output by the i-th down-sampling module.
  • the intermediate downsampling feature data output by the last downsampling module is used as the downsampling feature data.
  • when i equals 1, the input data of the i-th downsampling module is the candidate target image data; when i is greater than 1, the input data of the i-th downsampling module is the intermediate downsampled feature data output by the (i-1)-th downsampling module.
  • images are stored in different data formats (such as RAW, RGB, YUV444, YUV420) in different modules. If the image denoising model is embedded in the chip at a position where the input and output image format is one such as YUV420, in which the channel resolutions are inconsistent, a multiple-input multiple-output network structure is needed, namely the structure in Figure 8.
  • the target image data corresponding to the target image consists of the first channel pixel value and the second channel pixel value.
  • the first channel pixel value is the Y channel pixel value, and the second channel pixel value is the UV channel pixel value. Because the resolutions differ, an additional downsampling module is needed to pre-downsample the first channel pixel value, that is, the Y channel pixel value.
  • the channel feature data output by the additional downsampling module then has the same resolution as the second channel pixel value, that is, the UV channel pixel values.
  • the channel feature data and the second channel pixel value can then be fused directly, and each downsampling module can perform its normal downsampling processing.
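  • a minimal sketch of this input-side handling for YUV420 data follows. It assumes a Y plane of shape (H, W, 1) and a UV plane of shape (H/2, W/2, 2); `extra_downsample` is a parameter-free stand-in (2x2 average pooling) for the additional downsampling module, which in the embodiments is a learned down-sampling layer plus convolution layer, and channel splicing is chosen here as the fusion purely for illustration.

```python
import numpy as np

def extra_downsample(y):
    """Stand-in for the additional downsampling module: 2x2 average pooling."""
    h, w, c = y.shape
    return y.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def fuse_yuv420(y, uv):
    """Bring Y to the UV resolution, then splice the two along the channel axis."""
    y_feat = extra_downsample(y)                      # (H/2, W/2, 1)
    if y_feat.shape[:2] != uv.shape[:2]:
        raise ValueError("Y features and UV plane must share a resolution")
    return np.concatenate([y_feat, uv], axis=-1)      # candidate target image data


y_plane = np.ones((8, 8, 1))       # first channel pixel values (Y)
uv_plane = np.zeros((4, 4, 2))     # second channel pixel values (UV)
candidate = fuse_yuv420(y_plane, uv_plane)
print(candidate.shape)
```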
  • the down-sampled feature data is input into the up-sampling model, and each up-sampling module in the up-sampling model performs up-sampling processing on the down-sampled feature data to obtain up-sampled feature data, including:
  • the input data of the i-th upsampling module is upsampled to obtain the intermediate upsampling feature data output by the i-th upsampling module.
  • the first intermediate channel feature data corresponding to the first channel pixel value included in the intermediate upsampling feature data output by the last upsampling module is input into the additional upsampling module to obtain the upsampling feature data output by the additional upsampling module.
  • when i equals 1, the input data of the i-th upsampling module is the downsampled feature data.
  • when i is greater than 1, the input data of the i-th upsampling module is the aggregated feature data obtained by fusing the intermediate upsampling feature data output by the (i-1)-th upsampling module with the intermediate downsampling feature data output by the downsampling module corresponding to the i-th upsampling module. In this way, the deep features, shallow features, and features of different resolutions in the target image data can be fully integrated to improve the noise reduction effect.
  • the noise reduction image data output by the image noise reduction model should also include noise reduction image data of different channels with different resolutions.
  • Each upsampling module upsamples the input data to obtain the intermediate upsampled feature data output by the last upsampling module.
  • the intermediate upsampling feature data output by the last upsampling module includes the first intermediate channel feature data and the second intermediate channel feature data.
  • the first intermediate channel feature data corresponds to the first channel pixel value, that is, the Y channel image data.
  • the first intermediate channel feature data is obtained by denoising the pixel values of the first channel.
  • the second intermediate channel feature data is obtained by denoising the pixel values of the second channel.
  • the output layer performs further fusion processing based on the upsampled feature data and the first channel pixel value in the target image data.
  • the output layer obtaining denoised image data based on the upsampling feature data and the target image data includes: inputting the upsampling feature data and the first channel pixel value in the target image data into the output layer for fusion processing to obtain the candidate noise reduction image data output by the output layer.
  • the noise reduction image data is then obtained from the candidate noise reduction image data and the second intermediate channel feature data, corresponding to the second channel pixel value, that is included in the intermediate upsampling feature data output by the last upsampling module.
  • the upsampling feature data contains noise features corresponding to the pixel values of the first channel, that is, the noise residual of the target image is extracted.
  • the upsampling feature data and the first channel pixel value are fused by the output layer, that is, the noise residual and the target image are superimposed in the output layer, so that the noise features in the first channel pixel value are removed and the candidate noise reduction image data output by the output layer is obtained.
  • the denoised image data can then be obtained from the candidate noise reduction image data and the second intermediate channel feature data included in the intermediate upsampling feature data output by the last upsampling module.
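  • the output side of the multi-input multi-output structure can be sketched in the same spirit. The layout below is an assumption made for illustration: the last upsampling module is taken to emit an (H/2, W/2, 3) array whose first channel is the first intermediate channel feature data (Y-related) and whose remaining two channels are the second intermediate channel feature data (denoised UV); `extra_upsample` is a nearest-neighbour stand-in for the additional upsampling module.

```python
import numpy as np

def extra_upsample(x, r=2):
    """Stand-in for the additional upsampling module: nearest-neighbour enlargement."""
    return x.repeat(r, axis=0).repeat(r, axis=1)

def assemble_yuv420_output(last_up, y):
    """Split the last upsampling output, restore Y resolution, and fuse with Y."""
    y_feat = last_up[..., :1]            # first intermediate channel feature data
    uv_denoised = last_up[..., 1:]       # second intermediate channel feature data
    y_residual = extra_upsample(y_feat)  # upsampled feature data at Y resolution
    y_denoised = y + y_residual          # output layer: element-wise superposition
    return y_denoised, uv_denoised       # two outputs with different resolutions


last_up = np.zeros((4, 4, 3))
last_up[..., 0] = -0.5                   # hypothetical Y residual
last_up[..., 1:] = 0.25                  # hypothetical denoised UV values
y_out, uv_out = assemble_yuv420_output(last_up, np.ones((8, 8, 1)))
print(y_out.shape, uv_out.shape)
```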
  • the multi-input and output image noise reduction model can also achieve noise reduction processing, expanding the application of ISP chips.
  • the additional downsampling module is composed of a cascaded downsampling layer and a convolutional layer
  • the additional upsampling module is composed of a cascaded convolutional layer and an upsampling layer.
  • FIG. 9 shows a schematic structural diagram of another noise reduction neural network provided by an embodiment of the present application.
  • the image denoising model shown in Figure 9 includes two down-sampling modules and two up-sampling modules.
  • the fusion module and other fusion processes in the output layer and down-sampling module are all based on element-wise addition.
  • the downsampling layer and the upsampling layer can achieve downsampling or upsampling through convolution processing.
  • the image noise reduction processing module is an important module in the ISP chip.
  • its input and output formats and data arrangement must be consistent with those of the noise reduction processing module in the traditional ISP chip, so as to minimize changes to the original layout of the ISP chip and speed up adoption. Therefore, embodiments of the present application provide a noise reduction neural network with multiple input and output formats, which replaces the original noise reduction module without significantly modifying the layout of the ISP chip.
  • the embodiments of this application use a multiple-input multiple-output network structure embedded in the ISP chip, which better solves the problem that a single-image noise reduction neural network cannot adapt to the layout and real-time requirements of the ISP chip.
  • the image denoising model in the embodiment of the present application can fully integrate the information of each channel's pixel values in the target image data, and can therefore better mine the information in the image and eliminate the noise in the image.
  • the ISP noise reduction algorithm of channel fusion can effectively improve the signal-to-noise ratio of the image.
  • the use of a lightweight noise reduction neural network can greatly improve the clarity of images and reduce noise. Along with the noise reduction, the improved image quality also reduces the trailing noise of moving objects in the image, which can improve the accuracy of subsequent image tasks on the target image, such as target detection or face recognition, and expands the application scope of the denoised target image.
  • each downsampling module includes a first downsampling module, a second downsampling module, and a fusion module cascaded with both the first downsampling module and the second downsampling module; the first downsampling module includes a first convolution layer and a first downsampling layer, and the second downsampling module includes a second downsampling layer.
  • the processing process of the downsampling module will be described below.
  • performing down-sampling processing on the input data of the i-th down-sampling module to obtain the intermediate down-sampling feature data output by the i-th down-sampling module includes: using the first down-sampling layer to down-sample the input data of the i-th down-sampling module to obtain the first down-sampling feature data output by the first down-sampling layer; using the first convolution layer to convolve the first down-sampling feature data to obtain the first convolution feature data output by the first convolution layer; using the second down-sampling layer to down-sample the input data of the i-th down-sampling module to obtain the second down-sampling feature data output by the second down-sampling layer; and using the fusion module to fuse the first convolution feature data and the second down-sampling feature data to obtain the intermediate down-sampling feature data output by the fusion module.
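  • the two-branch data flow of one down-sampling module can be sketched as below. All three operators are fixed, parameter-free stand-ins chosen for illustration (average pooling for the first down-sampling layer, a 3x3 box filter for the first convolution layer, strided subsampling for the second down-sampling layer); in the embodiments these layers are learned convolutions. Element-wise addition is assumed for the fusion module.

```python
import numpy as np

def avg_pool2(x):
    """Stand-in first down-sampling layer: 2x2 average pooling on (H, W, C)."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def box_conv3x3(x):
    """Stand-in first convolution layer: 3x3 box filter with zero padding."""
    h, w, _ = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros(x.shape, dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + h, dx:dx + w]
    return out / 9.0

def down_module(x):
    """Branch 1: down-sample then convolve. Branch 2: down-sample only. Then fuse."""
    branch1 = box_conv3x3(avg_pool2(x))   # first convolution feature data
    branch2 = x[::2, ::2]                 # second down-sampling feature data
    return branch1 + branch2              # fusion module (element-wise addition)


out = down_module(np.ones((8, 8, 4)))
print(out.shape)
```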
  • the first downsampling layer and the second downsampling layer may implement downsampling processing through convolution processing.
  • each downsampling module can select an appropriate convolution structure.
  • for example, the first downsampling layer and the first convolution layer can be chosen as convolution processing with a 5x5 convolution kernel, the second convolution layer can be chosen as convolution processing with a 3x3 convolution kernel, and so on.
  • the convolution processing of each layer in the downsampling module can use any combination of direct connection, 1x1 convolution, 3x3 convolution, 5x5 convolution, 7x7 convolution, etc., and the embodiment of the present application does not specifically limit this.
  • the upsampling module includes a cascaded second convolution layer and an upsampling layer; performing upsampling processing on the input data of the i-th upsampling module to obtain the intermediate upsampling feature data output by the i-th upsampling module includes: using the second convolution layer to perform convolution processing on the input data of the i-th upsampling module to obtain the second convolution feature data output by the second convolution layer; and using the upsampling layer to perform upsampling processing on the second convolution feature data to obtain the intermediate upsampling feature data output by the upsampling layer.
  • the upsampling layer upsamples the input data of the upsampling layer through convolution processing, unpooling processing or interpolation processing.
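  • two parameter-free ways of realizing the upsampling layer can be sketched as follows. Both are simplified illustrations: `upsample_nearest` is nearest-neighbour interpolation, and `unpool_zero_fill` is a basic form of unpooling that places each value at the top-left corner of its cell and fills the rest with zeros (unpooling variants that reuse recorded pooling positions are not shown).

```python
import numpy as np

def upsample_nearest(x, r=2):
    """Interpolation-style upsampling: repeat every pixel r times along H and W."""
    return x.repeat(r, axis=0).repeat(r, axis=1)

def unpool_zero_fill(x, r=2):
    """Simplified unpooling: write each value at the top-left of an r x r cell."""
    h, w, c = x.shape
    out = np.zeros((h * r, w * r, c), dtype=x.dtype)
    out[::r, ::r] = x
    return out


x = np.arange(4.0).reshape(2, 2, 1)
print(upsample_nearest(x)[..., 0])
print(unpool_zero_fill(x)[..., 0])
```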
  • each upsampling module may also include other numbers of convolutional layers and upsampling layers, which may be determined based on the computing power, bandwidth or storage capacity of the ISP chip.
  • the second convolution layer and the upsampling layer can choose an appropriate convolution structure.
  • the convolution processing of each layer can use any combination of direct connection, 1x1 convolution, 3x3 convolution, 5x5 convolution, 7x7 convolution, etc.
  • the embodiments of the present application do not specifically limit this.
  • the fusion processing can fuse the deep and shallow features of the image, and can be performed either by element-wise addition or by channel splicing.
  • Figure 10 shows a schematic diagram of an element-wise fusion process provided by an embodiment of the present application, and Figure 11 shows a schematic diagram of a channel-splicing fusion process provided by an embodiment of the present application.
  • the element-wise fusion of feature channels can significantly reduce the amount of data reading and calculation, but it may also lose some of the extracted features. If the chip has enough computing power and cache, you can choose to use channel splicing.
  • the element-wise addition method requires the two groups of features being fused to have exactly the same resolution and number of channels, while the channel splicing method does not require them to have the same number of channels. For different ISP chips, different channel fusion methods and upsampling methods can therefore be selected; the final noise reduction effect differs only slightly, but on some chips the difference in computing time and efficiency is very large. The fusion processing method, element-wise addition or channel splicing, can thus be determined according to the predetermined image processing requirements of the ISP chip to make the chip's image processing as efficient as possible.
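  • the shape constraints of the two fusion methods can be checked with a short NumPy sketch. The function names `fuse_add` and `fuse_concat` are illustrative; feature maps are assumed to be (H, W, C) arrays.

```python
import numpy as np

def fuse_add(a, b):
    """Element-wise addition: resolution and channel count must match exactly."""
    if a.shape != b.shape:
        raise ValueError("element-wise addition needs identical shapes")
    return a + b

def fuse_concat(a, b):
    """Channel splicing: only the resolution must match; channels are stacked."""
    if a.shape[:2] != b.shape[:2]:
        raise ValueError("channel splicing needs identical resolution")
    return np.concatenate([a, b], axis=-1)


deep = np.zeros((4, 4, 16))
shallow = np.ones((4, 4, 16))
narrow = np.ones((4, 4, 8))

print(fuse_add(deep, shallow).shape)     # channel count unchanged
print(fuse_concat(deep, narrow).shape)   # channel counts add up
```

  • because splicing enlarges the channel dimension, the layer that follows has more data to read and process, which matches the trade-off between data volume and retained features described above.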
  • the embodiment of this application provides a neural network noise reduction algorithm deployed on an ISP chip.
  • the main deployment process is as follows:
  • Step 1 Construct the basic structure of the U-Net neural network. Specifically, this includes determining the network framework, the input and output format of the network (RAW, RGB, YUV444, etc.), the number of downsampling and upsampling layers, and the sampling rate of each layer.
  • Step 2 Determine the structure of the downsampling module in the network structure.
  • Step 3 Select the channel fusion method and upsampling method.
  • channel fusion uses element-wise addition processing
  • upsampling uses transposed convolution processing to achieve upsampling.
  • Step 4 Determine the specific position of the network embedded in the ISP chip image processing process, such as RAW image denoising module, RGB image denoising module or YUV image denoising module.
  • Step 5 Run and debug the complete neural network denoising algorithm.
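  • the choices made in Steps 1 to 4 can be collected in a small configuration object, sketched below. The class name `DenoiserConfig`, its field names, the option strings and the default values are all hypothetical and serve only to show which decisions have to be fixed before the network is run and debugged in Step 5.

```python
from dataclasses import dataclass

IO_FORMATS = {"RAW", "RGB", "YUV444", "YUV420"}
FUSIONS = {"add", "concat"}
UPSAMPLERS = {"transposed_conv", "unpooling", "interpolation"}
ISP_STAGES = {"raw_denoise", "rgb_denoise", "yuv_denoise"}

@dataclass(frozen=True)
class DenoiserConfig:
    io_format: str = "YUV444"            # Step 1: input/output format
    num_levels: int = 3                  # Step 1: number of down/up-sampling levels
    rates: tuple = (2, 2, 2)             # Step 1: sampling rate per level (assumed 2:1)
    fusion: str = "add"                  # Step 3: channel fusion method
    upsampler: str = "transposed_conv"   # Step 3: upsampling method
    isp_stage: str = "yuv_denoise"       # Step 4: position in the ISP flow

    def __post_init__(self):
        if self.io_format not in IO_FORMATS:
            raise ValueError(f"unknown format: {self.io_format}")
        if self.fusion not in FUSIONS:
            raise ValueError(f"unknown fusion: {self.fusion}")
        if self.upsampler not in UPSAMPLERS:
            raise ValueError(f"unknown upsampler: {self.upsampler}")
        if self.isp_stage not in ISP_STAGES:
            raise ValueError(f"unknown ISP stage: {self.isp_stage}")
        if self.num_levels < 1 or len(self.rates) != self.num_levels:
            raise ValueError("need one sampling rate per level, and at least one level")
        if any(r <= 1 for r in self.rates):
            raise ValueError("each sampling rate must be greater than 1")


config = DenoiserConfig()
print(config)
```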
  • the entire network can include 3 multi-convolution parallel sub-modules for down-sampling processing and, correspondingly, 3 ordinary convolution layers and 3 upsampling layers for up-sampling processing.
  • the features of the original resolution are retained before each downsampling and fused after the corresponding upsampling layer, so that the deep and shallow features in the image and the features of different resolutions can be fully fused.
  • the entire network has a total of 15 layers of convolution and upsampling layers. An activation layer is added after each layer of convolution.
  • the data sizes output by the 15 layers of the network are, in order: 128x128x16, 128x128x16, 128x128x16, 64x64x32, 64x64x32, 64x64x32, 32x32x64, 32x32x64, 32x32x64, 32x32x16, 64x64x16, 64x64x16, 128x128x16, 128x128x16, 256x256x3 (YUV format output).
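  • the listed sizes can be tabulated to see how much data each layer produces. The sketch below only restates the sizes given above; the grouping into nine down-sampling-path layers followed by alternating convolution and upsampling layers is one reading of the list, not something the text states explicitly.

```python
# Output sizes (height, width, channels) of the 15 layers, in order.
layer_shapes = (
    [(128, 128, 16)] * 3      # first down-sampling sub-module
    + [(64, 64, 32)] * 3      # second down-sampling sub-module
    + [(32, 32, 64)] * 3      # third down-sampling sub-module
    + [(32, 32, 16), (64, 64, 16),      # convolution, then upsampling
       (64, 64, 16), (128, 128, 16),    # convolution, then upsampling
       (128, 128, 16), (256, 256, 3)]   # convolution, then upsampling (YUV output)
)

def num_values(shape):
    """Number of scalar values in one feature map."""
    h, w, c = shape
    return h * w * c

sizes = [num_values(s) for s in layer_shapes]
for index, (shape, size) in enumerate(zip(layer_shapes, sizes), start=1):
    print(f"layer {index:2d}: {shape} -> {size} values")
print("largest intermediate map:", max(sizes), "values")
```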
  • the number of downsampling layers in step 1 can be N layers, and N is greater than or equal to 1. If the ISP chip has sufficient computing power and cache, N can be set to an integer of 4 or greater.
  • the downsampling rate in step 1 can be N:1, and N is greater than 1.
  • N can be set to 3, 4, etc.
  • Different downsampling layers can also use different sampling rates.
  • the upsampling method in step 3 can also be unpooling, an interpolation algorithm, etc.
  • Unpooling or interpolation methods can effectively reduce the number of parameters in the upsampling layer.
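  • the parameter saving can be made concrete with the standard weight count of a convolution or transposed convolution layer, k x k x C_in x C_out plus one bias per output channel. The kernel size and channel counts in the example are hypothetical; unpooling and interpolation have no learned weights.

```python
def conv_params(kernel, c_in, c_out, bias=True):
    """Learned parameters of a (transposed) convolution with a square kernel."""
    return kernel * kernel * c_in * c_out + (c_out if bias else 0)

# Hypothetical upsampling layer mapping 16 channels to 16 channels with a 3x3 kernel.
transposed_conv = conv_params(3, 16, 16)
parameter_free = 0  # unpooling or interpolation

print("transposed convolution:", transposed_conv, "parameters")
print("unpooling / interpolation:", parameter_free, "parameters")
```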
  • the information of each channel of the image is fully integrated in the noise reduction process, which can better mine the information in the image and remove the noise in the image.
  • the channel fusion ISP noise reduction algorithm can effectively improve the signal-to-noise ratio of the image.
  • the use of lightweight noise reduction neural networks can greatly improve the clarity of images and reduce noise. While reducing noise, due to the improvement of image quality, the trailing noise of moving objects in the image can also be improved, which can improve the accuracy of image tasks such as target detection and face recognition.
  • the neural network used in the embodiments of this application has an improved network structure and improved basic operators and runs better in real time on ISP chips, thus solving the problem that ordinary neural networks cannot run in real time on mobile devices.
  • the embodiments of this application use a multi-input multi-output network structure that is embedded in the image processing flow of the ISP chip, which better solves the problem that a single-image denoising neural network cannot adapt to the image processing flow of the ISP chip.
  • Based on the same inventive concept, embodiments of the present application also provide an image noise reduction processing device for implementing the above-mentioned image noise reduction processing method.
  • the implementation solution provided by this device is similar to the one recorded in the above method. Therefore, for the specific limitations in the one or more image noise reduction processing device embodiments provided below, refer to the limitations of the image noise reduction processing method above; they are not repeated here.
  • an image noise reduction processing device is provided.
  • the image noise reduction processing device 1200 includes: a noise reduction module 1201, wherein:
  • the noise reduction module 1201 is used to input the target image data into the image noise reduction model to obtain the noise reduction image data output by the image noise reduction model.
  • the target image data includes the pixel values of each channel of the target image; the image noise reduction model includes a cascaded downsampling model, an upsampling model and an output layer.
  • the downsampling model includes n cascaded downsampling modules.
  • the upsampling model includes n cascaded upsampling modules that correspond to the n downsampling modules one-to-one.
  • the down-sampling module includes a first down-sampling module, a second down-sampling module, and a fusion module cascaded with both the first down-sampling module and the second down-sampling module; the first down-sampling module includes a cascaded first down-sampling layer and a first convolutional layer, the second downsampling module includes a second downsampling layer.
  • the noise reduction module 1201 is specifically used to: input the target image data into the downsampling model, where each downsampling module performs downsampling processing on the target image data to obtain downsampled feature data; input the downsampled feature data into the upsampling model, where each upsampling module performs upsampling processing on the downsampled feature data to obtain upsampled feature data; and obtain, by the output layer, the noise-reduced image data based on the upsampled feature data and the target image data.
  • the noise reduction module 1201 is specifically used to: for the i-th upsampling module, perform upsampling processing on the input data of the i-th upsampling module to obtain the intermediate upsampled feature data output by the i-th upsampling module.
  • When i = 1, the input data of the i-th upsampling module is the downsampled feature data; when i is greater than 1, the input data of the i-th upsampling module is the aggregated feature data obtained by fusing the intermediate upsampled feature data output by the (i-1)-th upsampling module with the intermediate downsampled feature data output by the downsampling module corresponding to the i-th upsampling module. The intermediate upsampled feature data output by the last upsampling module is used as the upsampled feature data.
  • the noise reduction module 1201 is specifically configured to: input the upsampling feature data and the target image data to the output layer for fusion processing to obtain the noise reduction image data output by the output layer.
  • When the image data of the channels of the target image differ in resolution, the downsampling model also includes an additional downsampling module, and the noise reduction module 1201 is specifically used to: input the first-channel pixel values of the target image contained in the target image data into the additional downsampling module to obtain the channel feature data output by the additional downsampling module.
  • The channel feature data is fused with the second-channel pixel values of the target image contained in the target image data to obtain candidate target image data; for the i-th downsampling module, the input data of the i-th downsampling module is downsampled to obtain the intermediate downsampled feature data output by the i-th downsampling module.
  • When i = 1, the input data of the i-th downsampling module is the candidate target image data; when i is greater than 1, the input data of the i-th downsampling module is the intermediate downsampled feature data output by the (i-1)-th downsampling module.
  • The intermediate downsampled feature data output by the last downsampling module is used as the downsampled feature data.
  • the upsampling model also includes an additional upsampling module, and the noise reduction module 1201 is specifically used to: for the i-th upsampling module, perform upsampling processing on the input data of the i-th upsampling module to obtain the intermediate upsampled feature data output by the i-th upsampling module.
  • When i = 1, the input data of the i-th upsampling module is the downsampled feature data; when i is greater than 1, the input data of the i-th upsampling module is the aggregated feature data obtained by fusing the intermediate upsampled feature data output by the (i-1)-th upsampling module with the intermediate downsampled feature data output by the downsampling module corresponding to the i-th upsampling module.
  • The first intermediate channel feature data, which corresponds to the first-channel pixel values and is included in the intermediate upsampled feature data output by the last upsampling module, is input into the additional upsampling module to obtain the upsampled feature data output by the additional upsampling module.
  • the noise reduction module 1201 is specifically configured to: input the upsampled feature data and the first-channel pixel values in the target image data into the output layer for fusion processing to obtain the candidate noise-reduced image data output by the output layer; and obtain the noise-reduced image data from the candidate noise-reduced image data and the second intermediate channel feature data, which corresponds to the second-channel pixel values and is included in the intermediate upsampled feature data output by the last upsampling module.
  • the noise reduction module 1201 is specifically configured to: use the first downsampling layer to downsample the input data of the i-th downsampling module to obtain the first downsampled feature data output by the first downsampling layer; use the first convolutional layer to convolve the first downsampled feature data to obtain the first convolution feature data output by the first convolutional layer; use the second downsampling layer to downsample the input data of the i-th downsampling module to obtain the second downsampled feature data output by the second downsampling layer; and use the fusion module to fuse the first convolution feature data and the second downsampled feature data to obtain the intermediate downsampled feature data output by the fusion module.
  • the upsampling module includes a cascaded second convolutional layer and an upsampling layer; the noise reduction module 1201 is specifically used to: use the second convolutional layer to perform convolution processing on the input data of the i-th upsampling module to obtain the second convolution feature data output by the second convolutional layer; and use the upsampling layer to perform upsampling processing on the second convolution feature data to obtain the intermediate upsampled feature data output by the upsampling layer.
  • the image noise reduction model is used in the RAW image noise reduction module, the RGB image noise reduction module or the YUV image noise reduction module of the ISP chip; correspondingly, the format of the target image is RAW format, RGB format or YUV format.
  • the upsampling layer performs upsampling processing on the input data of the upsampling layer through convolution processing, unpooling processing or interpolation processing.
  • Each module in the above image noise reduction processing device can be implemented in whole or in part by software, hardware, and combinations thereof.
  • Each of the above modules may be embedded in or independent of the processor of the computer device in the form of hardware, or may be stored in the memory of the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.
  • a computer device is provided.
  • the computer device may be a terminal, and its internal structure may be as shown in Figure 13.
  • the computer device includes a processor, memory, communication interface, display screen and input device connected through a system bus.
  • the processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes non-volatile storage media and internal memory.
  • the non-volatile storage medium stores operating systems and computer programs. This internal memory provides an environment for the execution of operating systems and computer programs in non-volatile storage media.
  • the communication interface of the computer device is used for wired or wireless communication with external terminals.
  • the wireless mode can be implemented through WIFI, mobile cellular network, NFC (Near Field Communication) or other technologies.
  • the computer program implements an image noise reduction processing method when executed by a processor.
  • the display screen of the computer device may be a liquid crystal display or an electronic ink display.
  • the input device of the computer device may be a touch layer covering the display screen, a button, trackball or touch pad provided on the housing of the computer device, or an external keyboard, trackpad or mouse, etc.
  • Figure 13 is only a block diagram of a partial structure related to the solution of the present application, and does not constitute a limitation on the computer equipment to which the solution of the present application is applied.
  • A specific computer device may include more or fewer parts than shown, combine certain parts, or have a different arrangement of parts.
  • In one embodiment, an electronic device is provided, including a memory and a processor.
  • a computer program is stored in the memory.
  • When the processor executes the computer program, it implements the steps in the above method embodiments.
  • In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored.
  • When the computer program is executed by a processor, the steps in the above method embodiments are implemented.
  • In one embodiment, a computer program product is provided, including a computer program that implements the steps in each of the above method embodiments when executed by a processor.
  • the computer program can be stored in a non-volatile computer-readable storage medium.
  • the computer program, when executed, may include the processes of the above method embodiments.
  • Any reference to memory, database or other media used in the embodiments provided in this application may include at least one of non-volatile and volatile memory.
  • Non-volatile memory can include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, etc.
  • Volatile memory may include random access memory (Random Access Memory, RAM) or external cache memory, etc.
  • the databases involved in the various embodiments provided in this application may include at least one of a relational database and a non-relational database.
  • Non-relational databases may include blockchain-based distributed databases, etc., but are not limited thereto.
  • the processors involved in the various embodiments provided in this application may be general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic devices, quantum computing-based data processing logic devices, etc., and are not limited to this.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Image Processing (AREA)

Abstract

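This application relates to an image noise reduction processing method, apparatus, device, storage medium, and program product. The method includes: inputting target image data into an image noise reduction model to obtain noise-reduced image data, the target image data including the pixel values of each channel of a target image; wherein the image noise reduction model includes a cascaded downsampling model, upsampling model, and output layer; the downsampling model includes n cascaded downsampling modules, and the upsampling model includes n cascaded upsampling modules in one-to-one correspondence with the n downsampling modules; each downsampling module includes a first downsampling module, a second downsampling module, and a fusion module cascaded with both the first downsampling module and the second downsampling module; the first downsampling module includes a first convolutional layer and a first downsampling layer, and the second downsampling module includes a second downsampling layer. The method can meet the real-time requirements of an ISP chip and can be used to denoise video images.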

Description

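Image noise reduction processing method, apparatus, device, storage medium, and program product
Related Application
This application claims priority to Chinese patent application No. 2022111286666, filed on September 16, 2022 and entitled "Image noise reduction processing method, apparatus, device, storage medium, and program product", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of image processing technology, and in particular to an image noise reduction processing method, apparatus, device, storage medium, and program product.
Background
Image noise reduction is a key task in the field of image processing. An ISP (Image Signal Processor) chip is mainly used to process real-time video images captured by a terminal, and for image noise reduction it places high real-time demands on the noise reduction algorithm.
However, current image noise reduction algorithms cannot deliver excellent noise reduction while also meeting the real-time requirements of ISP chips.
Summary
In view of this, it is necessary to address the above technical problem by providing an image noise reduction processing method, apparatus, device, storage medium, and program product that meet the real-time requirements of ISP chips and can be used to denoise video images.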
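In a first aspect, this application provides an image noise reduction processing method. The method includes:
inputting target image data into an image noise reduction model to obtain noise-reduced image data output by the model, the target image data including the pixel values of each channel of a target image; wherein the image noise reduction model includes a cascaded downsampling model, upsampling model, and output layer; the downsampling model includes n cascaded downsampling modules, and the upsampling model includes n cascaded upsampling modules in one-to-one correspondence with the n downsampling modules; each downsampling module includes a first downsampling module, a second downsampling module, and a fusion module cascaded with both the first downsampling module and the second downsampling module; the first downsampling module includes a cascaded first downsampling layer and first convolutional layer, and the second downsampling module includes a second downsampling layer.
In one embodiment, inputting the target image data into the image noise reduction model to obtain the noise-reduced image data output by the model includes: inputting the target image data into the downsampling model, where the downsampling modules downsample the target image data to obtain downsampled feature data; inputting the downsampled feature data into the upsampling model, where the upsampling modules upsample the downsampled feature data to obtain upsampled feature data; and obtaining, by the output layer, the noise-reduced image data based on the upsampled feature data and the target image data.
In one embodiment, the image data of all channels of the target image have the same resolution, and downsampling the target image data by the downsampling modules to obtain the downsampled feature data includes: for the i-th downsampling module, downsampling the input data of the i-th downsampling module to obtain the intermediate downsampled feature data output by the i-th downsampling module, where for i = 1 the input data of the i-th downsampling module is the target image data, and for i greater than 1 it is the intermediate downsampled feature data output by the (i-1)-th downsampling module; and taking the intermediate downsampled feature data output by the last downsampling module as the downsampled feature data.
In one embodiment, upsampling the downsampled feature data by the upsampling modules to obtain the upsampled feature data includes: for the i-th upsampling module, upsampling the input data of the i-th upsampling module to obtain the intermediate upsampled feature data output by the i-th upsampling module, where for i = 1 the input data of the i-th upsampling module is the downsampled feature data, and for i greater than 1 it is the aggregated feature data obtained by fusing the intermediate upsampled feature data output by the (i-1)-th upsampling module with the intermediate downsampled feature data output by the downsampling module corresponding to the i-th upsampling module; and taking the intermediate upsampled feature data output by the last upsampling module as the upsampled feature data.
In one embodiment, obtaining, by the output layer, the noise-reduced image data based on the upsampled feature data and the target image data includes: inputting the upsampled feature data and the target image data into the output layer for fusion processing to obtain the noise-reduced image data output by the output layer.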
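In one embodiment, the image data of the channels of the target image differ in resolution, the downsampling model further includes an additional downsampling module, and inputting the target image data into the downsampling model, where the downsampling modules downsample the target image data to obtain the downsampled feature data, includes:
inputting the first-channel pixel values of the target image contained in the target image data into the additional downsampling module to obtain the channel feature data output by the additional downsampling module; fusing the channel feature data with the second-channel pixel values of the target image contained in the target image data to obtain candidate target image data; for the i-th downsampling module, downsampling the input data of the i-th downsampling module to obtain the intermediate downsampled feature data output by the i-th downsampling module, where for i = 1 the input data of the i-th downsampling module is the candidate target image data, and for i greater than 1 it is the intermediate downsampled feature data output by the (i-1)-th downsampling module; and taking the intermediate downsampled feature data output by the last downsampling module as the downsampled feature data.
In one embodiment, the upsampling model further includes an additional upsampling module, and inputting the downsampled feature data into the upsampling model, where the upsampling modules upsample the downsampled feature data to obtain the upsampled feature data, includes:
for the i-th upsampling module, upsampling the input data of the i-th upsampling module to obtain the intermediate upsampled feature data output by the i-th upsampling module, where for i = 1 the input data of the i-th upsampling module is the downsampled feature data, and for i greater than 1 it is the aggregated feature data obtained by fusing the intermediate upsampled feature data output by the (i-1)-th upsampling module with the intermediate downsampled feature data output by the downsampling module corresponding to the i-th upsampling module; and inputting the first intermediate channel feature data, which corresponds to the first-channel pixel values and is included in the intermediate upsampled feature data output by the last upsampling module, into the additional upsampling module to obtain the upsampled feature data output by the additional upsampling module.
In one embodiment, obtaining, by the output layer, the noise-reduced image data based on the upsampled feature data and the target image data includes:
inputting the upsampled feature data and the first-channel pixel values in the target image data into the output layer for fusion processing to obtain the candidate noise-reduced image data output by the output layer; and obtaining the noise-reduced image data from the candidate noise-reduced image data and the second intermediate channel feature data, which corresponds to the second-channel pixel values and is included in the intermediate upsampled feature data output by the last upsampling module.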
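In one embodiment, downsampling the input data of the i-th downsampling module to obtain the intermediate downsampled feature data output by the i-th downsampling module includes:
downsampling the input data of the i-th downsampling module with the first downsampling layer to obtain the first downsampled feature data output by the first downsampling layer; convolving the first downsampled feature data with the first convolutional layer to obtain the first convolution feature data output by the first convolutional layer; downsampling the input data of the i-th downsampling module with the second downsampling layer to obtain the second downsampled feature data output by the second downsampling layer; and fusing the first convolution feature data and the second downsampled feature data with the fusion module to obtain the intermediate downsampled feature data output by the fusion module.
In one embodiment, the upsampling module includes a cascaded second convolutional layer and upsampling layer, and upsampling the input data of the i-th upsampling module to obtain the intermediate upsampled feature data output by the i-th upsampling module includes:
convolving the input data of the i-th upsampling module with the second convolutional layer to obtain the second convolution feature data output by the second convolutional layer; and upsampling the second convolution feature data with the upsampling layer to obtain the intermediate upsampled feature data output by the upsampling layer.
In one embodiment, the image noise reduction model is used in the RAW image noise reduction module, the RGB image noise reduction module, or the YUV image noise reduction module of an ISP chip; correspondingly, the format of the target image is RAW, RGB, or YUV.
In one embodiment, the upsampling layer upsamples its input data by convolution, unpooling, or interpolation.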
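In a second aspect, this application further provides an image noise reduction processing apparatus. The apparatus includes:
a noise reduction module, configured to input target image data into an image noise reduction model to obtain noise-reduced image data output by the model, the target image data including the pixel values of each channel of a target image;
wherein the image noise reduction model includes a cascaded downsampling model, upsampling model, and output layer; the downsampling model includes n cascaded downsampling modules, and the upsampling model includes n cascaded upsampling modules in one-to-one correspondence with the n downsampling modules; each downsampling module includes a first downsampling module, a second downsampling module, and a fusion module cascaded with both the first downsampling module and the second downsampling module; the first downsampling module includes a cascaded first downsampling layer and first convolutional layer, and the second downsampling module includes a second downsampling layer.
In one embodiment, the noise reduction module is specifically configured to:
input the target image data into the downsampling model, where the downsampling modules downsample the target image data to obtain downsampled feature data; input the downsampled feature data into the upsampling model, where the upsampling modules upsample the downsampled feature data to obtain upsampled feature data; and obtain, by the output layer, the noise-reduced image data based on the upsampled feature data and the target image data.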
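In one embodiment, the image data of all channels of the target image have the same resolution, and the noise reduction module is specifically configured to:
for the i-th downsampling module, downsample the input data of the i-th downsampling module to obtain the intermediate downsampled feature data output by the i-th downsampling module, where for i = 1 the input data of the i-th downsampling module is the target image data, and for i greater than 1 it is the intermediate downsampled feature data output by the (i-1)-th downsampling module; and take the intermediate downsampled feature data output by the last downsampling module as the downsampled feature data.
In one embodiment, the noise reduction module is specifically configured to:
for the i-th upsampling module, upsample the input data of the i-th upsampling module to obtain the intermediate upsampled feature data output by the i-th upsampling module, where for i = 1 the input data of the i-th upsampling module is the downsampled feature data, and for i greater than 1 it is the aggregated feature data obtained by fusing the intermediate upsampled feature data output by the (i-1)-th upsampling module with the intermediate downsampled feature data output by the downsampling module corresponding to the i-th upsampling module; and take the intermediate upsampled feature data output by the last upsampling module as the upsampled feature data.
In one embodiment, the noise reduction module is specifically configured to:
input the upsampled feature data and the target image data into the output layer for fusion processing to obtain the noise-reduced image data output by the output layer.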
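In one embodiment, the image data of the channels of the target image differ in resolution, the downsampling model further includes an additional downsampling module, and the noise reduction module is specifically configured to:
input the first-channel pixel values of the target image contained in the target image data into the additional downsampling module to obtain the channel feature data output by the additional downsampling module; fuse the channel feature data with the second-channel pixel values of the target image contained in the target image data to obtain candidate target image data; for the i-th downsampling module, downsample the input data of the i-th downsampling module to obtain the intermediate downsampled feature data output by the i-th downsampling module, where for i = 1 the input data of the i-th downsampling module is the candidate target image data, and for i greater than 1 it is the intermediate downsampled feature data output by the (i-1)-th downsampling module; and take the intermediate downsampled feature data output by the last downsampling module as the downsampled feature data.
In one embodiment, the upsampling model further includes an additional upsampling module, and the noise reduction module is specifically configured to:
for the i-th upsampling module, upsample the input data of the i-th upsampling module to obtain the intermediate upsampled feature data output by the i-th upsampling module, where for i = 1 the input data of the i-th upsampling module is the downsampled feature data, and for i greater than 1 it is the aggregated feature data obtained by fusing the intermediate upsampled feature data output by the (i-1)-th upsampling module with the intermediate downsampled feature data output by the downsampling module corresponding to the i-th upsampling module; and input the first intermediate channel feature data, which corresponds to the first-channel pixel values and is included in the intermediate upsampled feature data output by the last upsampling module, into the additional upsampling module to obtain the upsampled feature data output by the additional upsampling module.
In one embodiment, the noise reduction module is specifically configured to:
input the upsampled feature data and the first-channel pixel values in the target image data into the output layer for fusion processing to obtain the candidate noise-reduced image data output by the output layer; and obtain the noise-reduced image data from the candidate noise-reduced image data and the second intermediate channel feature data, which corresponds to the second-channel pixel values and is included in the intermediate upsampled feature data output by the last upsampling module.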
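In one embodiment, the noise reduction module is specifically configured to:
downsample the input data of the i-th downsampling module with the first downsampling layer to obtain the first downsampled feature data output by the first downsampling layer; convolve the first downsampled feature data with the first convolutional layer to obtain the first convolution feature data output by the first convolutional layer; downsample the input data of the i-th downsampling module with the second downsampling layer to obtain the second downsampled feature data output by the second downsampling layer; and fuse the first convolution feature data and the second downsampled feature data with the fusion module to obtain the intermediate downsampled feature data output by the fusion module.
In one embodiment, the upsampling module includes a cascaded second convolutional layer and upsampling layer, and the noise reduction module is specifically configured to:
convolve the input data of the i-th upsampling module with the second convolutional layer to obtain the second convolution feature data output by the second convolutional layer; and upsample the second convolution feature data with the upsampling layer to obtain the intermediate upsampled feature data output by the upsampling layer.
In one embodiment, the image noise reduction model is used in the RAW image noise reduction module, the RGB image noise reduction module, or the YUV image noise reduction module of an ISP chip; correspondingly, the format of the target image is RAW, RGB, or YUV.
In one embodiment, the upsampling layer upsamples its input data by convolution, unpooling, or interpolation.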
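In a third aspect, this application further provides an electronic device including a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the steps of the method of any embodiment of the first aspect.
In a fourth aspect, this application further provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the method of any embodiment of the first aspect.
In a fifth aspect, this application further provides a computer program product including a computer program that, when executed by a processor, implements the steps of the method of any embodiment of the first aspect.
With the above image noise reduction processing method, apparatus, device, storage medium, and program product, target image data containing the pixel values of every channel of the target image can be input directly into the image noise reduction model to obtain the noise-reduced image data output by the model, thereby denoising the target image. In an ISP chip, noise reduction is usually performed separately on the Y-channel pixel values and the UV-channel pixel values of a YUV image to obtain the noise-reduced image data. Because the Y channel and the UV channels each go through their own processing, the image data must be read repeatedly, so processing efficiency is poor and the real-time requirements of the ISP chip cannot be met. In this application, the target image data containing the pixel values of every channel can be input directly into the image noise reduction model, so the data of all channels are denoised together. Compared with denoising each channel separately, the data volume and computation are greatly reduced, which effectively improves processing efficiency and meets the real-time requirement; moreover, during noise reduction the channels of the target image can draw on one another's information, which yields a better noise reduction result. The network structure of the image noise reduction model is compact: it includes a cascaded downsampling model, upsampling model, and output layer; the downsampling model includes n cascaded downsampling modules, and the upsampling model includes n cascaded upsampling modules in one-to-one correspondence with the n downsampling modules; each downsampling module includes a first downsampling module, a second downsampling module, and a fusion module cascaded with both; the first downsampling module includes a cascaded first downsampling layer and first convolutional layer, and the second downsampling module includes a second downsampling layer. This compact model is enough to denoise the target image data effectively, so it is well suited to ISP chips for denoising real-time video images.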
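Brief Description of the Drawings
To describe the technical solutions in the embodiments of this application or in the conventional technology more clearly, the drawings needed for describing the embodiments or the conventional technology are briefly introduced below. The drawings described below are merely embodiments of this application, and a person of ordinary skill in the art can obtain other drawings from the disclosed drawings without creative effort.
Figure 1 is a schematic structural diagram of an image noise reduction model in one embodiment;
Figure 2 is a schematic structural diagram of a multi-convolution parallel module in one embodiment;
Figure 3 is a schematic diagram of a traditional ISP chip image processing flow in one embodiment;
Figure 4 is a schematic diagram of a first improved ISP chip image processing flow in one embodiment;
Figure 5 is a schematic diagram of a second improved ISP chip image processing flow in one embodiment;
Figure 6 is a schematic flowchart of noise reduction processing in one embodiment;
Figure 7 is a schematic structural diagram of a noise reduction neural network in one embodiment;
Figure 8 is a schematic structural diagram of another image noise reduction model in one embodiment;
Figure 9 is a schematic structural diagram of another noise reduction neural network in one embodiment;
Figure 10 is a schematic diagram of fusion by element-wise addition in one embodiment;
Figure 11 is a schematic diagram of fusion by channel concatenation in one embodiment;
Figure 12 is a structural block diagram of an image noise reduction processing apparatus in one embodiment;
Figure 13 is an internal structure diagram of a computer device in one embodiment.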
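Detailed Description
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application without creative effort fall within the scope of protection of this application.
To make the purpose, technical solutions, and advantages of this application clearer, this application is described in further detail below with reference to the drawings and embodiments. The specific embodiments described here are only intended to explain this application and do not limit it.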
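In the field of image processing, image noise reduction has always been a task that is hard to handle perfectly. Image noise reduction is a form of image restoration whose goal is to accurately identify the signal values or the noise values in an image, in other words to separate the signal part of the image from the noise part.
Image noise reduction algorithms currently fall into traditional algorithms and neural-network-based algorithms. Traditional algorithms denoise poorly and cannot meet quality requirements, while neural network algorithms are computationally very expensive, are not friendly to existing chips, and cannot meet the real-time requirements of ISP chips.
Traditional image noise reduction can be classified by the feature space in which signal and noise are separated (spatial-domain, frequency-domain, and combined spatial-frequency-domain noise reduction) and by the image range used (local and non-local noise reduction). Specific traditional methods include mean filtering, median filtering, Gaussian filtering, bilateral filtering, non-local means filtering, guided filtering, discrete cosine domain filtering, and wavelet transform domain filtering. These methods all rest on simple statistical assumptions about how signal and noise differ and use a fixed procedure to separate them. Because the assumed noise characteristics are too simple, separating the noise either removes part of the signal with it or leaves residual noise, so noise reduction is poor in real scenes, especially when noise is pronounced (for example, low-light imaging).
In recent years, neural-network-based image noise reduction algorithms have greatly improved noise reduction quality. Representative networks include the straight-chain DnCNN (Denoising Convolutional Neural Network), CBDNet (a convolutional blind denoising network), which contains a noise-level estimation subnetwork, and the attention-based RIDNet (real image denoising with feature attention). Neural-network-based algorithms perform clearly better than traditional algorithms, but their computation far exceeds that of traditional algorithms, which makes them hard to deploy in practice, especially on ISP (Image Signal Processor) chips with strict real-time requirements.
In view of this, the embodiments of this application provide an image noise reduction processing method that meets the real-time requirements of ISP chips and denoises video images well.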
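In one embodiment, an image noise reduction processing method is provided. The embodiments of this application describe the method as applied to a terminal that includes an ISP chip; it can be understood that the method can also be applied to a server, or to a system that includes a terminal and a server and is implemented through their interaction. Specifically, the method may be executed by the ISP chip in the terminal. The terminal may be, but is not limited to, any of various computer devices or imaging devices, and the server may be implemented as a standalone server or as a cluster of multiple servers.
In the embodiments of this application, the method includes: inputting target image data into an image noise reduction model to obtain the noise-reduced image data output by the model, the target image data including the pixel values of each channel of a target image. Figure 1 shows a schematic structural diagram of an image noise reduction model provided by an embodiment of this application. The image noise reduction model includes a cascaded downsampling model, upsampling model, and output layer; the downsampling model includes n cascaded downsampling modules, and the upsampling model includes n cascaded upsampling modules in one-to-one correspondence with the n downsampling modules; each downsampling module includes a first downsampling module, a second downsampling module, and a fusion module cascaded with both the first downsampling module and the second downsampling module; the first downsampling module includes a cascaded first downsampling layer and first convolutional layer, and the second downsampling module includes a second downsampling layer.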
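An ISP (Image Signal Processor) chip takes the images captured by the front-end image sensor of the terminal, applies a series of image processing steps, and outputs the processed images. Typically, the ISP chip processes an image delivered in RAW format by the image sensor as follows: the RAW image goes through dead pixel correction, dark current correction, lens shading correction, RAW image noise reduction, white balance, color interpolation, and so on to produce an RGB image; the RGB image goes through gamma correction, color correction, RGB-to-YUV conversion, and so on to produce a YUV image; the YUV image goes through noise reduction, edge enhancement, and brightness/contrast/hue/saturation adjustment; and finally the image data is encoded to obtain the output video image. Optionally, the images processed by the ISP chip may be single still images or video images made up of consecutive frames. Various image processing algorithms can be integrated into the ISP chip to implement these steps. In the embodiments of this application, the image noise reduction model is the algorithm used in the ISP chip to implement the image noise reduction step.
It should be noted that Figure 1 uses three downsampling modules and three upsampling modules only as an example and does not limit this application.
In the embodiments of this application, the target image is the image that needs to be denoised in the ISP chip, and the target image data is the pixel values of each channel of that image. If the front-end image sensor captures a single still image, the target image is that still image; if the front-end image sensor captures real-time video, the target image is a single frame of that video.
Optionally, the target image may be in RAW, RGB, or YUV format, among others; the embodiments of this application do not specifically limit this.
The image noise reduction model can process the target image differently depending on whether the image data of its channels have the same resolution. Specifically, for a target image whose channels all have the same resolution, the model is single-input single-output and can denoise the input target image data directly; for a target image whose channels differ in resolution, the model is multi-input multi-output, and the image data of the different channels can be fed into the model separately for noise reduction. This allows the model to denoise target images of various types.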
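Image noise reduction today includes single-image noise reduction and multi-frame joint noise reduction. Given the compute and data buffering available on an ISP chip, single-image noise reduction is better suited to real-time processing than multi-frame joint noise reduction. Weighing actual noise reduction quality against network computational complexity and data read/write volume, the embodiments of this application build on the U-Net network and optimize its structure and computing units to propose a very compact single-image noise reduction network, namely the image noise reduction model, whose main structure is a U-Net. The downsampling model of the image noise reduction model downsamples the target image data, the upsampling model upsamples the feature data obtained from downsampling, and the output layer outputs the noise-reduced image data corresponding to the target image based on the data output by the upsampling model and the target image data. The input of each upsampling module is the output data of the previous upsampling module together with the output data of the downsampling module corresponding to that upsampling module. In this way, deep and shallow image features are fused, which improves noise reduction.
The downsampling model consists of n cascaded downsampling modules, each of which is a multi-convolution parallel module used to extract more image features. For a clearer picture of this module, refer to Figure 2, which shows a schematic structural diagram of a multi-convolution parallel module provided by an embodiment of this application. As an example, the multi-convolution parallel module includes two downsampling layers 201, one convolutional layer 202, and a fusion layer 203. Optionally, the multi-convolution parallel module 200 may include other numbers of downsampling layers and convolutional layers, which the embodiments of this application do not specifically limit. In other words, besides the first convolutional layer, the first downsampling layer, and the second downsampling layer, the downsampling module may include further convolutional layers and downsampling layers, which can be determined from parameters such as the compute and bandwidth of the ISP chip itself; the embodiments of this application do not specifically limit this. It should be noted that a downsampling module with the structure provided in the embodiments of this application is sufficient for the image noise reduction model to achieve good downsampling results, ensuring noise reduction quality while meeting the real-time requirements of the ISP chip.
Correspondingly, the number of downsampling modules and upsampling modules in the image noise reduction model can be determined from parameters such as the compute and bandwidth of the ISP chip itself; the embodiments of this application do not specifically limit this.
Optionally, the fusion module in the downsampling module may fuse channels by element-wise addition or by channel concatenation, among other options; the embodiments of this application do not specifically limit this.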
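With the above image noise reduction processing method, target image data containing the pixel values of every channel of the target image can be input directly into the image noise reduction model to obtain the noise-reduced image data output by the model, thereby denoising the target image. In an ISP chip, noise reduction is usually performed separately on the Y-channel pixel values and the UV-channel pixel values of a YUV image to obtain the noise-reduced image data. Because the Y channel and the UV channels each go through their own processing, the image data must be read repeatedly, so processing efficiency is poor and the real-time requirements of the ISP chip cannot be met. In this application, the target image data containing the pixel values of every channel can be input directly into the image noise reduction model, so the data of all channels are denoised together. Compared with denoising each channel separately, the data volume and computation are greatly reduced, which effectively improves processing efficiency and meets the real-time requirement; moreover, during noise reduction the channels of the target image can draw on one another's information, which yields a better noise reduction result. The network structure of the image noise reduction model is compact: it includes a cascaded downsampling model, upsampling model, and output layer; the downsampling model includes n cascaded downsampling modules, and the upsampling model includes n cascaded upsampling modules in one-to-one correspondence with the n downsampling modules; each downsampling module includes a first downsampling module, a second downsampling module, and a fusion module cascaded with both; the first downsampling module includes a first convolutional layer and a first downsampling layer, and the second downsampling module includes a second downsampling layer. This compact model is enough to denoise the target image data effectively, so it is well suited to ISP chips for denoising real-time video images.
In one embodiment, the image noise reduction model is used in the RAW image noise reduction module, the RGB image noise reduction module, or the YUV image noise reduction module of the ISP chip; correspondingly, the format of the target image is RAW, RGB, or YUV.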
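Refer to Figure 3, which shows a schematic diagram of a traditional ISP chip image processing flow provided by an embodiment of this application. The traditional ISP processing flow includes: obtaining the RAW image of each frame of the video transmitted by the front-end image sensor; applying dead pixel correction, dark current correction, lens shading correction, RAW image noise reduction, white balance, color interpolation, and so on to obtain an RGB image; applying gamma correction, color correction, RGB-to-YUV conversion, and so on to obtain a YUV image; denoising the Y (luminance) channel data of the YUV image and applying edge enhancement and brightness/contrast adjustment; applying hue/saturation adjustment to the UV (color) channel data of the YUV image, which completes the noise reduction processing of the YUV image; and then encoding the processed YUV image to obtain the final output video image.
A RAW image is the format captured by the image sensor and is essentially a special RGB format. After a series of processing steps, color interpolation turns the RAW image into an ordinary RGB image, which after further processing is converted into a YUV image. The YUV format separates the luminance of an image from its color: luminance is represented by the Y channel and color by the U and V channels. In YUV processing, the Y channel and the UV channels are handled separately: a noise reduction algorithm is applied to the Y (luminance) channel and to the UV (color) channels individually, then edge enhancement and brightness and contrast adjustment are applied to the luminance component, and hue and saturation adjustment are applied to the color components.
Because traditional ISP processing handles the YUV image channel by channel, the image data of each channel must be read and written repeatedly, so noise reduction quality is poor and processing efficiency is low. For this reason, the embodiments of this application apply the image noise reduction model in the ISP chip, where it can directly denoise RGB, RAW, or YUV images. For example, when the model is applied in the RAW image noise reduction module of the ISP chip, the target image is in RAW format; when it is applied in the RGB image noise reduction module, the target image is in RGB format; and when it is applied in the YUV image noise reduction module, the target image is in YUV format.
This replaces the traditional channel-by-channel noise reduction of YUV images and improves both noise reduction quality and processing efficiency.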
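In one case, refer to Figure 4, which shows a schematic diagram of a first improved ISP chip image processing flow provided by an embodiment of this application. The image noise reduction model is applied in the YUV image noise reduction module of the ISP chip to denoise the data of all channels of the YUV image directly. In this case, the target image data input to the model is the pixel values of each channel of the YUV image. This greatly improves the quality and efficiency of YUV noise reduction, and the efficiency gain is even more pronounced when processing real-time video.
In another case, refer to Figure 5, which shows a schematic diagram of a second improved ISP chip image processing flow provided by an embodiment of this application. The image noise reduction model is applied in the RGB image noise reduction module of the ISP chip to denoise the data of all channels of the RGB image. The ISP chip can then convert the denoised RGB image directly into a YUV image and continue with subsequent processing, without denoising the YUV image channel by channel again, which improves noise reduction efficiency. In this case, the target image data input to the model is the pixel values of each channel of the RGB image.
Optionally, the image noise reduction model can also be applied in the RAW image noise reduction module of the ISP chip to denoise the RAW image directly. In this case, the target image data input to the model is the pixel values of each channel of the RAW image.
In the embodiments of this application, the noise reduction module of a traditional ISP chip is adjusted slightly: luminance noise reduction and color noise reduction are merged and performed as a whole by the image noise reduction model, which improves the ability to extract the image signal during noise reduction and raises both the signal-to-noise ratio and the subjective noise reduction quality. For a YUV image, the channels (Y, U, V) can draw on one another's information while being denoised, which gives better noise reduction; in addition, the computation and data read/write volume of merged-channel processing are lower than the total for processing the channels separately. Merging the channels therefore brings gains in both performance and quality, and applying the image noise reduction model to image noise reduction in the ISP chip meets the chip's requirements in terms of both noise reduction quality and real-time behavior.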
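The specific processing that the image noise reduction model applies to the target image data is described below.
In one embodiment, Figure 6 shows a schematic flowchart of noise reduction processing provided by an embodiment of this application. Inputting the target image data into the image noise reduction model to obtain the noise-reduced image data output by the model includes:
Step 601: input the target image data into the downsampling model, where the downsampling modules downsample the target image data to obtain downsampled feature data.
Step 602: input the downsampled feature data into the upsampling model, where the upsampling modules upsample the downsampled feature data to obtain upsampled feature data.
Step 603: the output layer obtains the noise-reduced image data based on the upsampled feature data and the target image data.
The downsampling model consists of n cascaded downsampling modules. Optionally, the value of n can be chosen according to the actual compute and storage conditions of the chip, which in turn determines the structure of the downsampling model and the upsampling model. The embodiments of this application do not specifically limit the number of upsampling and downsampling modules, although the number of upsampling modules should equal the number of downsampling modules.
Each downsampling module downsamples its input data in order to extract more image features. Downsampling shrinks the image, so upsampling modules are used correspondingly to restore the image size.
The last of the n cascaded downsampling modules outputs the downsampled feature data, and the last of the n cascaded upsampling modules outputs the upsampled feature data.
The upsampled feature data and the target image data are both input to the output layer for fusion processing, and the noise-reduced image data is obtained from the data output by the output layer.
Almost all current single image denoising algorithms are hard to run in real time on ISP chips. By simplifying the network structure and constructing a lightweight neural network model suited to running on the chip, this application obtains an image noise reduction model that allows a single-image noise reduction algorithm to be used for real-time noise reduction on ISP chips. It satisfies the requirements for both noise reduction quality and real-time operation, solves the problem of deploying and running a neural network on the chip in real time, and clearly improves noise reduction quality compared with traditional chips.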
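Specifically, target image data corresponding to different types of target images are processed differently when input into the image noise reduction model. The processing for two types of target images is described separately below.
In one case, the target image is in a format whose channels all have the same resolution, for example RAW, RGB, or YUV444. In this case, downsampling the target image data by the downsampling modules to obtain the downsampled feature data includes:
for the i-th downsampling module, downsampling the input data of the i-th downsampling module to obtain the intermediate downsampled feature data output by the i-th downsampling module; and taking the intermediate downsampled feature data output by the last downsampling module as the downsampled feature data. For i = 1, the input data of the i-th downsampling module is the target image data; for i greater than 1, it is the intermediate downsampled feature data output by the (i-1)-th downsampling module.
In an ISP chip, images are stored in different data formats (such as RAW, RGB, YUV444, and YUV420) in different modules. When the input and output of the image noise reduction model are in a format whose channels all have the same resolution, such as RAW, RGB, or YUV444, the first downsampling module takes the whole target image data as its input, and each other downsampling module takes the intermediate downsampled feature data output by the previous downsampling module as its input. Each downsampling module extracts image features to obtain intermediate downsampled feature data.
Correspondingly, in one embodiment, upsampling the downsampled feature data by the upsampling modules to obtain the upsampled feature data includes:
for the i-th upsampling module, upsampling the input data of the i-th upsampling module to obtain the intermediate upsampled feature data output by the i-th upsampling module; and taking the intermediate upsampled feature data output by the last upsampling module as the upsampled feature data. For i = 1, the input data of the i-th upsampling module is the downsampled feature data; for i greater than 1, it is the aggregated feature data obtained by fusing the intermediate upsampled feature data output by the (i-1)-th upsampling module with the intermediate downsampled feature data output by the downsampling module corresponding to the i-th upsampling module.
For the first upsampling module, the downsampled feature data is used as its input. For each other upsampling module, the intermediate upsampled feature data output by the previous upsampling module is fused with the intermediate downsampled feature data output by the downsampling module corresponding to that upsampling module to obtain aggregated feature data, which is used as that module's input. In this way, the deep and shallow features of the target image data, and the features at each resolution, are fully fused, which improves noise reduction.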
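The wiring between the two paths can be sketched as follows. The text only says that each upsampling module has a "corresponding" downsampling module; pairing upsampling module i with downsampling module n - i + 1 (the one whose output has the matching resolution) is an assumption made here, as are the function names and the uniform 2:1 rate.

```python
def up_module_inputs(n):
    """List the feature maps feeding each of the n upsampling modules.

    Assumption: upsampling module i is paired with downsampling module
    n - i + 1, the one whose output has the matching resolution.
    """
    plan = {}
    for i in range(1, n + 1):
        if i == 1:
            plan[f"up{i}"] = [f"down{n}"]  # deepest features only
        else:
            plan[f"up{i}"] = [f"up{i - 1}", f"down{n - i + 1}"]
    return plan


def reduction_factors(n, rate=2):
    """Spatial reduction factor of every module output relative to the input."""
    factors = {f"down{k}": rate ** k for k in range(1, n + 1)}
    factors.update({f"up{i}": rate ** (n - i) for i in range(1, n + 1)})
    return factors


plan = up_module_inputs(3)
factors = reduction_factors(3)
for name, sources in plan.items():
    # Every pair of fused inputs shares one reduction factor.
    print(name, "inputs:", sources, "factors:", [factors[s] for s in sources])
```

Under this pairing the two inputs of every fusion step have the same spatial size, and the last upsampling module returns to the input resolution.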
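Correspondingly, in one embodiment, obtaining, by the output layer, the noise-reduced image data based on the upsampled feature data and the target image data includes: inputting the upsampled feature data and the target image data into the output layer for fusion processing to obtain the noise-reduced image data output by the output layer.
Specifically, the output layer mainly fuses its input data. In other words, both the output layer and the fusion module in the downsampling module perform feature fusion. Optionally, the output layer may fuse its inputs by simple element-wise addition or by channel-wise stacking.
After the preceding processing, the upsampled feature data output by the last upsampling module contains the noise feature data of the target image. Fusing the target image data with this upsampled feature data removes the noise features from the target image data, so the output layer outputs noise-reduced image data from which the noise feature data has been removed. In other words, the image noise reduction model as a whole outputs a noise residual, and the output layer superimposes this residual on the target image to produce the denoised image.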
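A minimal sketch of this residual formulation, assuming element-wise addition in the output layer and assuming the network predicts the negative of the noise so that addition cancels it (the text does not fix the sign convention). The function name and the toy values are illustrative only.

```python
def output_layer(noisy, residual):
    """Fuse the input image with the predicted noise residual by element-wise addition."""
    return [[p + r for p, r in zip(row_p, row_r)]
            for row_p, row_r in zip(noisy, residual)]


clean = [[10, 10], [10, 10]]
noise = [[1, -2], [0, 3]]
noisy = [[c + n for c, n in zip(rc, rn)] for rc, rn in zip(clean, noise)]

# An ideal network would output exactly the negated noise as its residual.
ideal_residual = [[-n for n in row] for row in noise]
denoised = output_layer(noisy, ideal_residual)
print(denoised)
```

Predicting a residual rather than the clean image means the network only has to model the noise, while the image content passes through the addition unchanged.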
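Optionally, the upsampling module may consist of a cascaded convolutional layer and upsampling layer.
Based on the above, a single-input single-output image noise reduction model provided by an embodiment of this application can be obtained. Refer to Figure 7, which shows a schematic structural diagram of a noise reduction neural network provided by an embodiment of this application. As an example, the image noise reduction model shown in Figure 7 includes three downsampling modules and three upsampling modules; the output layer, the fusion modules in the downsampling modules, and the other fusion operations all use element-wise addition; and the downsampling and upsampling layers can perform downsampling or upsampling by convolution.
The image noise reduction model based on the U-Net structure provided by the embodiments of this application is thus structurally simple and denoises the data of all image channels as a whole, which ensures good noise reduction with high processing efficiency.
With the neural network framework and the structure of the basic computing units unchanged, the number of downsampling and upsampling modules can be changed according to the actual compute and bandwidth limits, which can moderately improve noise reduction. Figure 7 uses three downsampling steps and three upsampling steps as an example; in practice the number can be increased (for example to 4 or 5) or decreased (for example to 2). Channel fusion can use element-wise addition, channel concatenation, and so on.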
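In another case, the target image is in a format whose channels differ in resolution, for example YUV420. In this case, the downsampling model of the image noise reduction model further includes an additional downsampling module, and the upsampling model further includes an additional upsampling module. Refer to Figure 8, which shows a schematic structural diagram of another image noise reduction model provided by an embodiment of this application.
Correspondingly, inputting the target image data into the downsampling model, where the downsampling modules downsample the target image data to obtain the downsampled feature data, includes:
inputting the first-channel pixel values of the target image contained in the target image data into the additional downsampling module to obtain the channel feature data output by the additional downsampling module; fusing the channel feature data with the second-channel pixel values of the target image contained in the target image data to obtain candidate target image data, which serves as the input data of the first downsampling module; for the i-th downsampling module, downsampling the input data of the i-th downsampling module to obtain the intermediate downsampled feature data output by the i-th downsampling module; and taking the intermediate downsampled feature data output by the last downsampling module as the downsampled feature data.
For i = 1, the input data of the i-th downsampling module is the candidate target image data; for i greater than 1, it is the intermediate downsampled feature data output by the (i-1)-th downsampling module.
In an ISP chip, images are stored in different data formats (such as RAW, RGB, YUV444, and YUV420) in different modules. If the images input to and output from the model embedded in the chip are in a format whose channels differ in resolution, such as YUV420, a multi-input multi-output network structure is required, namely the structure in Figure 8.
Specifically, for a target image whose channels differ in resolution, the corresponding target image data consists of first-channel pixel values and second-channel pixel values. Taking a YUV420 target image as an example, the first-channel pixel values are the Y-channel pixel values and the second-channel pixel values are the UV-channel pixel values. Because the resolutions differ, the first-channel (Y) pixel values are first downsampled by the additional downsampling module, so that the channel feature data it outputs has the same resolution as the second-channel (UV) pixel values. The channel feature data and the second-channel pixel values can then be fused directly and passed through the downsampling modules in the normal way.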
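The resolution matching described above can be sketched as follows. The patent's additional downsampling module is a learned downsampling layer followed by a convolutional layer; the 2x2 average used here is only a dependency-free stand-in, and the function names and channel-concatenation fusion are assumptions for illustration.

```python
def avg_pool2(plane):
    """Halve a 2-D plane by averaging 2x2 blocks (stand-in for the learned module)."""
    h, w = len(plane), len(plane[0])
    if h % 2 or w % 2:
        raise ValueError("plane dimensions must be even")
    return [[(plane[r][c] + plane[r][c + 1]
              + plane[r + 1][c] + plane[r + 1][c + 1]) / 4.0
             for c in range(0, w, 2)]
            for r in range(0, h, 2)]


def build_candidate_input(y, u, v):
    """Bring Y down to the chroma resolution, then fuse by channel concatenation."""
    y_features = avg_pool2(y)
    size = (len(u), len(u[0]))
    if (len(y_features), len(y_features[0])) != size or (len(v), len(v[0])) != size:
        raise ValueError("chroma planes must be half the luma resolution")
    return [y_features, u, v]  # candidate target image data: 3 channels, one size


# YUV420: a 4x4 luma plane with 2x2 chroma planes.
y = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
u = [[128.0, 128.0], [128.0, 128.0]]
v = [[64.0, 64.0], [64.0, 64.0]]
candidate = build_candidate_input(y, u, v)
print(len(candidate), len(candidate[0]), len(candidate[0][0]))
```

Once all three planes share one resolution, the regular downsampling modules can treat them as an ordinary multi-channel input.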
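Correspondingly, in one embodiment, inputting the downsampled feature data into the upsampling model, where the upsampling modules upsample the downsampled feature data to obtain the upsampled feature data, includes:
for the i-th upsampling module, upsampling the input data of the i-th upsampling module to obtain the intermediate upsampled feature data output by the i-th upsampling module; and inputting the first intermediate channel feature data, which corresponds to the first-channel pixel values and is included in the intermediate upsampled feature data output by the last upsampling module, into the additional upsampling module to obtain the upsampled feature data output by the additional upsampling module.
For i = 1, the input data of the i-th upsampling module is the downsampled feature data; for i greater than 1, it is the aggregated feature data obtained by fusing the intermediate upsampled feature data output by the (i-1)-th upsampling module with the intermediate downsampled feature data output by the downsampling module corresponding to the i-th upsampling module. In this way, the deep and shallow features of the target image data, and the features at each resolution, are fully fused, which improves noise reduction.
Corresponding to target image data that contains first-channel and second-channel pixel values, the noise-reduced image data output by the model should also include noise-reduced image data for the different channels at their different resolutions.
As in the image noise reduction model described earlier, each upsampling module upsamples its input data, yielding the intermediate upsampled feature data output by the last upsampling module. Here, that intermediate upsampled feature data includes first intermediate channel feature data and second intermediate channel feature data. The first intermediate channel feature data corresponds to the first-channel pixel values, that is, the Y-channel image data; in other words, it is obtained by denoising the first-channel pixel values. Likewise, the second intermediate channel feature data is obtained by denoising the second-channel pixel values.
Because the first-channel pixel values went through the additional downsampling module, the first intermediate channel feature data must correspondingly pass through the additional upsampling module, and the output of the additional upsampling module is taken as the upsampled feature data finally output by the upsampling model. The output layer then performs further fusion based on this upsampled feature data and the first-channel pixel values in the target image data.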
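Correspondingly, in one embodiment, obtaining, by the output layer, the noise-reduced image data based on the upsampled feature data and the target image data includes: inputting the upsampled feature data and the first-channel pixel values in the target image data into the output layer for fusion processing to obtain the candidate noise-reduced image data output by the output layer; and obtaining the noise-reduced image data from the candidate noise-reduced image data and the second intermediate channel feature data, which corresponds to the second-channel pixel values and is included in the intermediate upsampled feature data output by the last upsampling module.
Specifically, after the preceding processing, the upsampled feature data contains the noise features corresponding to the first-channel pixel values, that is, the extracted noise residual of the target image. The output layer fuses the upsampled feature data with the first-channel pixel values, superimposing the noise residual on the target image, which removes the noise features from the first-channel pixel values and yields the candidate noise-reduced image data output by the output layer.
The noise-reduced image data is then obtained from the candidate noise-reduced image data and the second intermediate channel feature data included in the intermediate upsampled feature data output by the last upsampling module.
In this way, the multi-input multi-output image noise reduction model can also denoise images whose channels differ in resolution, which extends the applications of the ISP chip.
Optionally, the additional downsampling module consists of a cascaded downsampling layer and convolutional layer, and the additional upsampling module consists of a cascaded convolutional layer and upsampling layer.
Based on the above, the structure of the multi-input multi-output image noise reduction model provided by an embodiment of this application can be obtained. Refer to Figure 9, which shows a schematic structural diagram of another noise reduction neural network provided by an embodiment of this application. As an example, the image noise reduction model shown in Figure 9 includes two downsampling modules and two upsampling modules; the output layer, the fusion modules in the downsampling modules, and the other fusion operations all use element-wise addition; and the downsampling and upsampling layers can perform downsampling or upsampling by convolution.
In the embodiments of this application, the image noise reduction module is an important module of the ISP chip, and its input and output formats and data layout must match those of the noise reduction module in a traditional ISP chip, so as to minimize changes to the chip's original layout and speed up adoption. The embodiments of this application therefore provide a noise reduction neural network with multi-input multi-output formats that replaces the original noise reduction module without major changes to the layout of the ISP chip. Compared with traditional neural networks, the embodiments of this application use a multi-input multi-output network structure embedded in the ISP chip, which addresses the problem that single-image noise reduction neural networks cannot adapt to the layout and real-time requirements of ISP chips.
Compared with the image processing algorithms in existing ISP chips, the image noise reduction model in the embodiments of this application fully fuses the information of the pixel values of every channel in the target image data, which makes it possible to better mine the information in the image and remove its noise. The channel-fusion ISP noise reduction algorithm can effectively improve the signal-to-noise ratio of the image. Compared with traditional noise reduction algorithms, a lightweight noise reduction neural network can substantially improve image clarity and reduce noise. Because image quality improves as noise is reduced, the trailing noise of moving objects in the image is also mitigated, which can improve accuracy when the target image is later used for image tasks such as target detection or face recognition and broadens the range of applications of the denoised target image.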
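As described above, each downsampling module includes a first downsampling module, a second downsampling module, and a fusion module cascaded with both the first downsampling module and the second downsampling module; the first downsampling module includes a first convolutional layer and a first downsampling layer, and the second downsampling module includes a second downsampling layer. The processing performed by the downsampling module is described below.
In one embodiment, downsampling the input data of the i-th downsampling module to obtain the intermediate downsampled feature data output by the i-th downsampling module includes: downsampling the input data of the i-th downsampling module with the first downsampling layer to obtain the first downsampled feature data output by the first downsampling layer; convolving the first downsampled feature data with the first convolutional layer to obtain the first convolution feature data output by the first convolutional layer; downsampling the input data of the i-th downsampling module with the second downsampling layer to obtain the second downsampled feature data output by the second downsampling layer; and fusing the first convolution feature data and the second downsampled feature data with the fusion module to obtain the intermediate downsampled feature data output by the fusion module.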
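The data flow through one downsampling module can be sketched on a single-channel image as follows. The patent's layers are learned convolutions; the fixed 2x2 average and 3x3 mean filter below are stand-ins chosen only to keep the sketch dependency-free, and all function names are hypothetical.

```python
def downsample2(x):
    """2:1 downsampling of a 2-D list by averaging each 2x2 block."""
    h, w = len(x), len(x[0])
    return [[(x[r][c] + x[r][c + 1] + x[r + 1][c] + x[r + 1][c + 1]) / 4.0
             for c in range(0, w - 1, 2)]
            for r in range(0, h - 1, 2)]


def box_conv3(x):
    """3x3 mean filter with edge replication (stand-in for the first conv layer)."""
    h, w = len(x), len(x[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), h - 1)
                    cc = min(max(c + dc, 0), w - 1)
                    acc += x[rr][cc]
            row.append(acc / 9.0)
        out.append(row)
    return out


def fuse_add(a, b):
    """Fusion module: element-wise addition of two equally sized maps."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def downsampling_module(x):
    branch1 = box_conv3(downsample2(x))  # first downsampling layer, then first conv layer
    branch2 = downsample2(x)             # second downsampling layer only
    return fuse_add(branch1, branch2)    # fusion module


image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
out = downsampling_module(image)
print(len(out), len(out[0]))
```

Both branches read the same module input and both halve the resolution, so their outputs can be fused directly; the second branch acts as a cheap shortcut around the convolution.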
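Optionally, the first downsampling layer and the second downsampling layer can perform downsampling by convolution.
Optionally, each downsampling module can use a suitable convolution structure. For example, the first downsampling layer and the first convolutional layer may use convolution with a 5x5 kernel, and the second convolutional layer may use convolution with a 3x3 kernel. The convolutions of the layers in the downsampling module can of course use any combination of direct connection, 1x1 convolution, 3x3 convolution, 5x5 convolution, 7x7 convolution, and so on; the embodiments of this application do not specifically limit this.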
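To see why kernel size matters on a compute-limited chip, the standard parameter and multiply-accumulate counts of a convolution can be compared directly. The layer sizes below (3 input channels, 16 output channels, 128x128 output) are illustrative values chosen for this sketch, not figures taken from the patent's kernel discussion.

```python
def conv_cost(h_out, w_out, c_in, c_out, k):
    """Return (parameters, multiply-accumulates) of a k x k convolution with bias."""
    params = k * k * c_in * c_out + c_out
    macs = h_out * w_out * k * k * c_in * c_out
    return params, macs


for k in (1, 3, 5, 7):
    params, macs = conv_cost(128, 128, 3, 16, k)
    print(f"{k}x{k}: {params} parameters, {macs} MACs")
```

Both counts grow with the square of the kernel size, so a 5x5 layer costs roughly 2.8 times the multiply-accumulates of a 3x3 layer with the same channel counts.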
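The processing performed by the upsampling module is described below.
In one embodiment, the upsampling module includes a cascaded second convolutional layer and upsampling layer. Upsampling the input data of the i-th upsampling module to obtain the intermediate upsampled feature data output by the i-th upsampling module includes: convolving the input data of the i-th upsampling module with the second convolutional layer to obtain the second convolution feature data output by the second convolutional layer; and upsampling the second convolution feature data with the upsampling layer to obtain the intermediate upsampled feature data output by the upsampling layer.
Optionally, the upsampling layer upsamples its input data by convolution, unpooling, or interpolation.
Optionally, each upsampling module may include other numbers of convolutional layers and upsampling layers, which can be determined from parameters such as the compute, bandwidth, or storage capacity of the ISP chip.
The second convolutional layer and the upsampling layer can use a suitable convolution structure, and the convolutions of the layers can use any combination of direct connection, 1x1 convolution, 3x3 convolution, 5x5 convolution, 7x7 convolution, and so on; the embodiments of this application do not specifically limit this.
In one embodiment, the fusion performed by the fusion module, by the output layer, and elsewhere in the image noise reduction model fuses deep and shallow image features, and in each case it can be implemented by element-wise addition or by channel concatenation. Refer to Figure 10, which shows a schematic diagram of fusion by element-wise addition provided by an embodiment of this application, and to Figure 11, which shows a schematic diagram of fusion by channel concatenation provided by an embodiment of this application.
Fusing feature channels by element-wise addition can significantly reduce the amount of data read and the computation, but it may also lose some of the features already extracted. When the chip has enough compute and cache, channel concatenation can be chosen instead. Element-wise addition requires the two groups of fused features to have exactly the same resolution and number of channels, whereas channel concatenation does not require the two groups to have the same number of channels. For different ISP chips, therefore, different channel fusion and upsampling methods make only a small difference to the final noise reduction quality, but for some chips the difference in running time and efficiency is very large. The fusion method (element-wise addition or channel concatenation) can thus be chosen according to the image processing requirements predetermined for the ISP chip, so as to make the most of the chip's image processing efficiency.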
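The two fusion modes and their shape requirements can be illustrated on nested lists laid out as channels x height x width. The helper names are hypothetical, and real implementations would operate on tensors inside the network.

```python
def shape(t):
    """Return (channels, height, width) of a nested-list feature map."""
    return (len(t), len(t[0]), len(t[0][0]))


def fuse_add(a, b):
    """Element-wise addition: resolution and channel count must both match."""
    if shape(a) != shape(b):
        raise ValueError(f"shape mismatch: {shape(a)} vs {shape(b)}")
    return [[[p + q for p, q in zip(ra, rb)] for ra, rb in zip(ca, cb)]
            for ca, cb in zip(a, b)]


def fuse_concat(a, b):
    """Channel concatenation: only the resolution must match."""
    if shape(a)[1:] != shape(b)[1:]:
        raise ValueError("spatial sizes differ")
    return a + b


def make(channels, h, w, value):
    """Build a constant feature map for the demonstration."""
    return [[[value] * w for _ in range(h)] for _ in range(channels)]


added = fuse_add(make(2, 2, 2, 1), make(2, 2, 2, 3))
stacked = fuse_concat(make(2, 2, 2, 1), make(3, 2, 2, 0))
print(shape(added), shape(stacked))
```

Addition keeps the channel count fixed, so the next layer reads no extra data, while concatenation grows the channel count and therefore the cost of the following convolution.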
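In one embodiment, the embodiments of this application provide a neural network noise reduction algorithm deployed on an ISP chip. The main deployment process is as follows:
Step 1: Construct the basic structure of the U-Net neural network. Specifically, this includes determining the network framework, the input and output format of the network (RAW, RGB, YUV444, etc.), the number of downsampling and upsampling layers, and the sampling rate of each layer.
Step 2: Determine the structure of the downsampling modules in the network.
Step 3: Select the channel fusion method and the upsampling method, for example element-wise addition for channel fusion and transposed convolution for upsampling.
Step 4: Determine where the network is embedded in the image processing flow of the ISP chip, for example the RAW image noise reduction module, the RGB image noise reduction module, or the YUV image noise reduction module.
Step 5: Run and debug the complete neural network noise reduction algorithm.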
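The choices made in steps 1 to 4 can be collected in one validated configuration object. This is only an organizational sketch: the class name, field names, and option strings are all hypothetical and do not come from the patent or from any ISP toolchain.

```python
from dataclasses import dataclass

FORMATS = {"RAW", "RGB", "YUV444", "YUV420"}
FUSIONS = {"add", "concat"}
UPSAMPLERS = {"transposed_conv", "unpooling", "interpolation"}
STAGES = {"raw_denoise", "rgb_denoise", "yuv_denoise"}


@dataclass(frozen=True)
class DenoiserPlan:
    image_format: str      # step 1: input/output format
    sampling_rates: tuple  # step 1: one rate per downsampling level
    fusion: str            # step 3: channel fusion method
    upsampling: str        # step 3: upsampling method
    isp_stage: str         # step 4: where the network sits in the ISP flow

    def __post_init__(self):
        if self.image_format not in FORMATS:
            raise ValueError(f"unknown format: {self.image_format}")
        if not self.sampling_rates or not all(r > 1 for r in self.sampling_rates):
            raise ValueError("need at least one sampling rate, each greater than 1")
        if self.fusion not in FUSIONS:
            raise ValueError(f"unknown fusion: {self.fusion}")
        if self.upsampling not in UPSAMPLERS:
            raise ValueError(f"unknown upsampling: {self.upsampling}")
        if self.isp_stage not in STAGES:
            raise ValueError(f"unknown ISP stage: {self.isp_stage}")

    @property
    def num_levels(self):
        return len(self.sampling_rates)

    @property
    def needs_extra_branch(self):
        # Formats with mixed channel resolutions use the multi-input multi-output variant.
        return self.image_format == "YUV420"


plan = DenoiserPlan("YUV444", (2, 2, 2), "add", "transposed_conv", "yuv_denoise")
print(plan.num_levels, plan.needs_extra_branch)
```

Validating the combination up front keeps an unsupported choice from surfacing only during step 5, when the full network is run and debugged.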
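Further, as described in step 1, the whole network may for example include 3 multi-convolution parallel submodules for downsampling and, corresponding to them, 3 ordinary convolutional layers and 3 upsampling layers for upsampling. Before each downsampling step the features at the original resolution are retained and then fused after the corresponding upsampling layer, so that the deep and shallow features of the image, and the features at each resolution, are fully fused. The whole network has 15 convolution and upsampling layers in total, with an activation layer after each convolution. Taking a 256x256x3 YUV image as the input data, the output sizes of the 15 layers are 128x128x16, 128x128x16, 128x128x16, 64x64x32, 64x64x32, 64x64x32, 32x32x64, 32x32x64, 32x32x64, 32x32x16, 64x64x16, 64x64x16, 128x128x16, 128x128x16, and 256x256x3 (YUV output).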
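The fifteen output sizes listed above can be regenerated from a few structural parameters, which makes the layer accounting explicit. The sketch assumes a 2:1 rate at every level and assumes that the three layers inside each downsampling module (two downsampling layers and one convolution) all emit the reduced resolution; the function name and argument names are hypothetical.

```python
def trace_shapes(size=256, in_ch=3, down_ch=(16, 32, 64), up_ch=16):
    """List the (height, width, channels) output of every layer in the example network."""
    shapes = []
    s = size
    for ch in down_ch:
        s //= 2
        # Two downsampling layers and one convolution per module, all at the new size.
        shapes += [(s, s, ch)] * 3
    for i in range(len(down_ch)):
        shapes.append((s, s, up_ch))  # ordinary convolution
        s *= 2
        last = i == len(down_ch) - 1
        # Upsampling layer; the final one restores the image channel count.
        shapes.append((s, s, in_ch if last else up_ch))
    return shapes


shapes = trace_shapes()
for idx, (h, w, c) in enumerate(shapes, start=1):
    print(f"layer {idx:2d}: {h}x{w}x{c}")
```

Three modules of three layers account for the first nine outputs, and three convolution plus upsampling pairs account for the remaining six.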
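Optionally, the number of downsampling levels in step 1 may be N, where N is greater than or equal to 1. When the ISP chip has sufficient computing power and cache, N may be set to 4 or a larger integer.
Optionally, the downsampling rate in step 1 may be N:1, where N (here denoting the rate rather than the number of levels) is greater than 1; for example, N may be set to 3, 4, and so on. Different downsampling layers may also use different sampling rates.
Optionally, the upsampling method in step 3 may be unpooling, an interpolation algorithm, or the like. Unpooling and interpolation can effectively reduce the number of parameters in the upsampling layer.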
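A rough calculation shows how these options affect resolution and parameter count. The sketch assumes integer sampling rates and a standard transposed convolution whose parameters are its weights plus one bias per output channel; the function names and the 16-channel, 3x3 example are illustrative assumptions.

```python
def resolution_after(size, rates):
    """Side length left after applying each downsampling rate in turn."""
    for rate in rates:
        size //= rate
    return size


def transposed_conv_params(c_in, c_out, kernel, bias=True):
    """Learned parameters of one transposed-convolution upsampling layer."""
    return c_in * c_out * kernel * kernel + (c_out if bias else 0)


# Three levels at 2:1 versus two levels at 4:1, starting from 256 pixels.
three_levels = resolution_after(256, (2, 2, 2))
two_levels = resolution_after(256, (4, 4))

# A 16-to-16-channel 3x3 transposed convolution has learned weights, whereas
# interpolation and unpooling upsample without any learned parameters.
learned_params = transposed_conv_params(16, 16, 3)
interpolation_params = 0
```

The saving applies per upsampling layer, so it scales with the number of levels chosen in step 1.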
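Compared with existing image processing algorithms in ISP chips, the embodiments of this application fully fuse the information of all channels of the image during noise reduction processing, and can therefore better exploit the information in the image and remove the noise in it. The channel-fusion ISP noise reduction algorithm can effectively improve the signal-to-noise ratio of the image. Compared with traditional noise reduction algorithms, using a lightweight noise reduction neural network can considerably improve image sharpness and reduce noise. Along with reducing noise, the improved image quality also mitigates the trailing noise of moving objects in the image and can improve the accuracy of image tasks such as object detection and face recognition. Compared with ordinary noise reduction neural networks, the neural network used in the embodiments of this application improves the network structure and the basic operators so that it can run in real time on an ISP chip, which solves the problem that ordinary neural networks cannot run in real time on mobile devices. Compared with ordinary neural networks, the embodiments of this application use a multi-input, multi-output network structure that is embedded in the image processing pipeline of the ISP chip, which solves the problem that single-image noise reduction neural networks cannot be adapted to the image processing pipeline of the ISP chip.
It should be understood that although the steps in the flowcharts of the embodiments described above are shown sequentially as indicated by the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict restriction on the order in which these steps are performed, and they may be performed in other orders. Moreover, at least some of the steps in the flowcharts of the embodiments described above may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily completed at the same moment but may be performed at different moments, and they are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
Based on the same inventive concept, the embodiments of this application further provide an image noise reduction processing apparatus for implementing the image noise reduction processing method described above. The solution provided by the apparatus is similar to the one described for the method; therefore, for the specific limitations in the one or more apparatus embodiments provided below, reference may be made to the limitations of the image noise reduction processing method above, which are not repeated here.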
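In one embodiment, as shown in FIG. 12, an image noise reduction processing apparatus is provided. The image noise reduction processing apparatus 1200 includes a noise reduction module 1201, wherein:
The noise reduction module 1201 is configured to input target image data into an image noise reduction model to obtain denoised image data output by the image noise reduction model, the target image data including the pixel values of each channel of a target image; wherein the image noise reduction model includes a cascaded downsampling model, upsampling model, and output layer, the downsampling model includes n cascaded downsampling modules, and the upsampling model includes n cascaded upsampling modules in one-to-one correspondence with the n downsampling modules; each downsampling module includes a first downsampling module, a second downsampling module, and a fusion module cascaded with both the first downsampling module and the second downsampling module; the first downsampling module includes a cascaded first downsampling layer and first convolutional layer, and the second downsampling module includes a second downsampling layer.
In one embodiment, the noise reduction module 1201 is specifically configured to: input the target image data into the downsampling model, where the downsampling modules in the downsampling model perform downsampling processing on the target image data to obtain downsampled feature data; input the downsampled feature data into the upsampling model, where the upsampling modules in the upsampling model perform upsampling processing on the downsampled feature data to obtain upsampled feature data; and obtain, by the output layer, the denoised image data based on the upsampled feature data and the target image data.
In one embodiment, the image data of the channels of the target image have the same resolution, and the noise reduction module 1201 is specifically configured to: for the i-th downsampling module, perform downsampling processing on the input data of the i-th downsampling module to obtain the intermediate downsampled feature data output by the i-th downsampling module; wherein, when i=1, the input data of the i-th downsampling module is the target image data, and when i is greater than 1, the input data of the i-th downsampling module is the intermediate downsampled feature data output by the (i-1)-th downsampling module; and use the intermediate downsampled feature data output by the last downsampling module as the downsampled feature data.
In one embodiment, the noise reduction module 1201 is specifically configured to: for the i-th upsampling module, perform upsampling processing on the input data of the i-th upsampling module to obtain the intermediate upsampled feature data output by the i-th upsampling module; wherein, when i=1, the input data of the i-th upsampling module is the downsampled feature data, and when i is greater than 1, the input data of the i-th upsampling module is aggregated feature data obtained by fusing the intermediate upsampled feature data output by the (i-1)-th upsampling module with the intermediate downsampled feature data output by the downsampling module corresponding to the i-th upsampling module; and use the intermediate upsampled feature data output by the last upsampling module as the upsampled feature data.
In one embodiment, the noise reduction module 1201 is specifically configured to: input the upsampled feature data and the target image data into the output layer for fusion processing to obtain the denoised image data output by the output layer.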
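The data flow for the equal-resolution case can be sketched end to end with the modules passed in as plain callables. The sketch assumes that the i-th upsampling module (i greater than 1) is paired with the downsampling module whose output has the matching resolution, that is, downsampling module n-i+1; this pairing, the function name `denoise`, and the toy 1-D operations are assumptions for illustration only.

```python
def denoise(image, down_modules, up_modules, fuse, output_layer):
    """Run the downsampling chain, the upsampling chain with skip fusion,
    and the output layer that fuses the result with the original input."""
    n = len(down_modules)
    skips = []
    x = image
    for down in down_modules:
        x = down(x)            # input of module i is the output of module i-1
        skips.append(x)        # keep every intermediate downsampled feature
    y = up_modules[0](x)       # i = 1: input is the final downsampled feature
    for i in range(2, n + 1):
        skip = skips[n - i]    # output of downsampling module n-i+1 (assumed pairing)
        y = up_modules[i - 1](fuse(y, skip))
    return output_layer(y, image)


# Toy 1-D stand-ins: halve by striding, double by repetition, fuse by addition.
down = lambda x: x[::2]
up = lambda x: [v for v in x for _ in range(2)]
add = lambda a, b: [p + q for p, q in zip(a, b)]

result = denoise([1, 2, 3, 4], [down, down], [up, up], add, add)
```

In a real model each callable would be a learned module, and `fuse` could be either element-wise addition or channel concatenation.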
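In one embodiment, the image data of the channels of the target image have different resolutions, the downsampling model further includes an additional downsampling module, and the noise reduction module 1201 is specifically configured to: input the first-channel pixel values of the target image contained in the target image data into the additional downsampling module to obtain channel feature data output by the additional downsampling module; perform fusion processing on the channel feature data and the second-channel pixel values of the target image contained in the target image data to obtain candidate target image data; for the i-th downsampling module, perform downsampling processing on the input data of the i-th downsampling module to obtain the intermediate downsampled feature data output by the i-th downsampling module; wherein, when i=1, the input data of the i-th downsampling module is the candidate target image data, and when i is greater than 1, the input data of the i-th downsampling module is the intermediate downsampled feature data output by the (i-1)-th downsampling module; and use the intermediate downsampled feature data output by the last downsampling module as the downsampled feature data.
In one embodiment, the upsampling model further includes an additional upsampling module, and the noise reduction module 1201 is specifically configured to: for the i-th upsampling module, perform upsampling processing on the input data of the i-th upsampling module to obtain the intermediate upsampled feature data output by the i-th upsampling module; wherein, when i=1, the input data of the i-th upsampling module is the downsampled feature data, and when i is greater than 1, the input data of the i-th upsampling module is aggregated feature data obtained by fusing the intermediate upsampled feature data output by the (i-1)-th upsampling module with the intermediate downsampled feature data output by the downsampling module corresponding to the i-th upsampling module; and input the first intermediate channel feature data, which is included in the intermediate upsampled feature data output by the last upsampling module and corresponds to the first-channel pixel values, into the additional upsampling module to obtain the upsampled feature data output by the additional upsampling module.
In one embodiment, the noise reduction module 1201 is specifically configured to: input the upsampled feature data and the first-channel pixel values in the target image data into the output layer for fusion processing to obtain candidate denoised image data output by the output layer; and obtain the denoised image data according to the candidate denoised image data and the second intermediate channel feature data, which is included in the intermediate upsampled feature data output by the last upsampling module and corresponds to the second-channel pixel values.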
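The mixed-resolution path is easier to follow as a trace of tensor shapes. The sketch below assumes a YUV420-like layout in which the first channel (Y) is full resolution and the second channel (two chroma planes) is half resolution, a factor of 2 for every module, and channel concatenation for the first fusion; these are illustrative assumptions, as are all names and channel counts.

```python
def plan_mixed_resolution(h, w, levels=3):
    """Return the (height, width, channels) expected at each stage."""
    first = (h, w, 1)                      # first-channel pixel values (assumed Y)
    second = (h // 2, w // 2, 2)           # second-channel pixel values (assumed UV)
    # Additional downsampling module brings the first channel to the
    # resolution of the second channel.
    channel_feat = (h // 2, w // 2, 1)
    # Fusion (assumed concatenation) yields the candidate target image data.
    candidate = (second[0], second[1], channel_feat[2] + second[2])
    # n downsampling modules shrink it; n upsampling modules restore it.
    scale = 2 ** levels
    bottleneck = (candidate[0] // scale, candidate[1] // scale)
    last_up = candidate
    first_mid = (last_up[0], last_up[1], 1)    # part matching the first channel
    second_mid = (last_up[0], last_up[1], 2)   # part matching the second channel
    # Additional upsampling module returns the first part to full resolution,
    # where the output layer fuses it with the first-channel pixel values.
    upsampled = (first_mid[0] * 2, first_mid[1] * 2, 1)
    return {
        "first_input": first,
        "candidate_input": candidate,
        "bottleneck": bottleneck,
        "denoised_first": upsampled,
        "denoised_second": second_mid,
    }


plan = plan_mixed_resolution(256, 256)
```

How the candidate denoised data and the second intermediate channel features are combined into the final output is not detailed in this passage, so the sketch only reports their shapes.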
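In one embodiment, the noise reduction module 1201 is specifically configured to: perform downsampling processing on the input data of the i-th downsampling module using the first downsampling layer to obtain first downsampled feature data output by the first downsampling layer; perform convolution processing on the first downsampled feature data using the first convolutional layer to obtain first convolutional feature data output by the first convolutional layer; perform downsampling processing on the input data of the i-th downsampling module using the second downsampling layer to obtain second downsampled feature data output by the second downsampling layer; and perform fusion processing on the first convolutional feature data and the second downsampled feature data using the fusion module to obtain the intermediate downsampled feature data output by the fusion module.
In one embodiment, the upsampling module includes a cascaded second convolutional layer and upsampling layer; the noise reduction module 1201 is specifically configured to: perform convolution processing on the input data of the i-th upsampling module using the second convolutional layer to obtain second convolutional feature data output by the second convolutional layer; and perform upsampling processing on the second convolutional feature data using the upsampling layer to obtain the intermediate upsampled feature data output by the upsampling layer.
In one embodiment, the image noise reduction model is used in a RAW image noise reduction module, an RGB image noise reduction module, or a YUV image noise reduction module of an ISP chip; correspondingly, the format of the target image is the RAW format, the RGB format, or the YUV format.
In one embodiment, the upsampling layer performs upsampling processing on its input data through convolution processing, unpooling processing, or interpolation processing.
Each module in the above image noise reduction processing apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in or independent of a processor in a computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can invoke them and perform the operations corresponding to each module.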
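In one embodiment, a computer device is provided. The computer device may be a terminal, and its internal structure may be as shown in FIG. 13. The computer device includes a processor, a memory, a communication interface, a display screen, and an input apparatus connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The communication interface of the computer device is used for wired or wireless communication with external terminals; the wireless communication may be implemented through WIFI, a mobile cellular network, NFC (near-field communication), or other technologies. When executed by the processor, the computer program implements an image noise reduction processing method. The display screen of the computer device may be a liquid crystal display or an electronic ink display. The input apparatus of the computer device may be a touch layer covering the display screen, a button, trackball, or touchpad provided on the housing of the computer device, or an external keyboard, touchpad, mouse, or the like.
Those skilled in the art will understand that the structure shown in FIG. 13 is merely a block diagram of part of the structure related to the solution of this application and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
In one embodiment, an electronic device is further provided, including a memory and a processor, the memory storing a computer program, and the processor implementing the steps of the above method embodiments when executing the computer program.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored, the computer program implementing the steps of the above method embodiments when executed by a processor.
In one embodiment, a computer program product is provided, including a computer program that implements the steps of the above method embodiments when executed by a processor.
Those of ordinary skill in the art will understand that all or part of the processes of the above method embodiments may be completed by a computer program instructing the relevant hardware. The computer program may be stored in a non-volatile computer-readable storage medium, and when executed may include the processes of the above method embodiments. Any reference to a memory, database, or other medium used in the embodiments provided in this application may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, and the like. Volatile memory may include random access memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM may take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM). The databases involved in the embodiments provided in this application may include at least one of a relational database and a non-relational database. Non-relational databases may include, without limitation, blockchain-based distributed databases. The processors involved in the embodiments provided in this application may be, without limitation, general-purpose processors, central processing units, graphics processing units, digital signal processors, programmable logic devices, data processing logic devices based on quantum computing, and the like.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of this application, and their descriptions are relatively specific and detailed, but they should not be construed as limiting the patent scope of this application. It should be noted that those of ordinary skill in the art may make several variations and improvements without departing from the concept of this application, all of which fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the appended claims.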

Claims (20)

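  1. An image noise reduction processing method, characterized in that the method comprises:
    inputting target image data into an image noise reduction model to obtain denoised image data output by the image noise reduction model, the target image data comprising the pixel values of each channel of a target image;
    wherein the image noise reduction model comprises a cascaded downsampling model, upsampling model, and output layer, the downsampling model comprises n cascaded downsampling modules, and the upsampling model comprises n cascaded upsampling modules in one-to-one correspondence with the n downsampling modules; each downsampling module comprises a first downsampling module, a second downsampling module, and a fusion module cascaded with both the first downsampling module and the second downsampling module; the first downsampling module comprises a cascaded first downsampling layer and first convolutional layer, and the second downsampling module comprises a second downsampling layer.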
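  2. The method according to claim 1, characterized in that inputting the target image data into the image noise reduction model to obtain the denoised image data output by the image noise reduction model comprises:
    inputting the target image data into the downsampling model, and performing, by the downsampling modules in the downsampling model, downsampling processing on the target image data to obtain downsampled feature data;
    inputting the downsampled feature data into the upsampling model, and performing, by the upsampling modules in the upsampling model, upsampling processing on the downsampled feature data to obtain upsampled feature data;
    obtaining, by the output layer, the denoised image data based on the upsampled feature data and the target image data.
  3. The method according to claim 2, characterized in that the image data of the channels of the target image have the same resolution, and performing, by the downsampling modules in the downsampling model, downsampling processing on the target image data to obtain the downsampled feature data comprises:
    for the i-th downsampling module, performing downsampling processing on the input data of the i-th downsampling module to obtain the intermediate downsampled feature data output by the i-th downsampling module; wherein, when i=1, the input data of the i-th downsampling module is the target image data, and when i is greater than 1, the input data of the i-th downsampling module is the intermediate downsampled feature data output by the (i-1)-th downsampling module;
    using the intermediate downsampled feature data output by the last downsampling module as the downsampled feature data.
  4. The method according to claim 3, characterized in that performing, by the upsampling modules in the upsampling model, upsampling processing on the downsampled feature data to obtain the upsampled feature data comprises:
    for the i-th upsampling module, performing upsampling processing on the input data of the i-th upsampling module to obtain the intermediate upsampled feature data output by the i-th upsampling module; wherein, when i=1, the input data of the i-th upsampling module is the downsampled feature data, and when i is greater than 1, the input data of the i-th upsampling module is aggregated feature data obtained by fusing the intermediate upsampled feature data output by the (i-1)-th upsampling module with the intermediate downsampled feature data output by the downsampling module corresponding to the i-th upsampling module;
    using the intermediate upsampled feature data output by the last upsampling module as the upsampled feature data.
  5. The method according to claim 4, characterized in that obtaining, by the output layer, the denoised image data based on the upsampled feature data and the target image data comprises:
    inputting the upsampled feature data and the target image data into the output layer for fusion processing to obtain the denoised image data output by the output layer.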
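  6. The method according to claim 2, characterized in that the image data of the channels of the target image have different resolutions, the downsampling model further comprises an additional downsampling module, and inputting the target image data into the downsampling model and performing, by the downsampling modules in the downsampling model, downsampling processing on the target image data to obtain the downsampled feature data comprises:
    inputting the first-channel pixel values of the target image contained in the target image data into the additional downsampling module to obtain channel feature data output by the additional downsampling module;
    performing fusion processing on the channel feature data and the second-channel pixel values of the target image contained in the target image data to obtain candidate target image data;
    for the i-th downsampling module, performing downsampling processing on the input data of the i-th downsampling module to obtain the intermediate downsampled feature data output by the i-th downsampling module; wherein, when i=1, the input data of the i-th downsampling module is the candidate target image data, and when i is greater than 1, the input data of the i-th downsampling module is the intermediate downsampled feature data output by the (i-1)-th downsampling module;
    using the intermediate downsampled feature data output by the last downsampling module as the downsampled feature data.
  7. The method according to claim 6, characterized in that the upsampling model further comprises an additional upsampling module, and inputting the downsampled feature data into the upsampling model and performing, by the upsampling modules in the upsampling model, upsampling processing on the downsampled feature data to obtain the upsampled feature data comprises:
    for the i-th upsampling module, performing upsampling processing on the input data of the i-th upsampling module to obtain the intermediate upsampled feature data output by the i-th upsampling module; wherein, when i=1, the input data of the i-th upsampling module is the downsampled feature data, and when i is greater than 1, the input data of the i-th upsampling module is aggregated feature data obtained by fusing the intermediate upsampled feature data output by the (i-1)-th upsampling module with the intermediate downsampled feature data output by the downsampling module corresponding to the i-th upsampling module;
    inputting the first intermediate channel feature data, which is included in the intermediate upsampled feature data output by the last upsampling module and corresponds to the first-channel pixel values, into the additional upsampling module to obtain the upsampled feature data output by the additional upsampling module.
  8. The method according to claim 7, characterized in that obtaining, by the output layer, the denoised image data based on the upsampled feature data and the target image data comprises:
    inputting the upsampled feature data and the first-channel pixel values in the target image data into the output layer for fusion processing to obtain candidate denoised image data output by the output layer;
    obtaining the denoised image data according to the candidate denoised image data and the second intermediate channel feature data, which is included in the intermediate upsampled feature data output by the last upsampling module and corresponds to the second-channel pixel values.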
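  9. The method according to claim 3 or 6, characterized in that performing downsampling processing on the input data of the i-th downsampling module to obtain the intermediate downsampled feature data output by the i-th downsampling module comprises:
    performing downsampling processing on the input data of the i-th downsampling module using the first downsampling layer to obtain first downsampled feature data output by the first downsampling layer;
    performing convolution processing on the first downsampled feature data using the first convolutional layer to obtain first convolutional feature data output by the first convolutional layer;
    performing downsampling processing on the input data of the i-th downsampling module using the second downsampling layer to obtain second downsampled feature data output by the second downsampling layer;
    performing fusion processing on the first convolutional feature data and the second downsampled feature data using the fusion module to obtain the intermediate downsampled feature data output by the fusion module.
  10. The method according to claim 4 or 7, characterized in that the upsampling module comprises a cascaded second convolutional layer and upsampling layer; and performing upsampling processing on the input data of the i-th upsampling module to obtain the intermediate upsampled feature data output by the i-th upsampling module comprises:
    performing convolution processing on the input data of the i-th upsampling module using the second convolutional layer to obtain second convolutional feature data output by the second convolutional layer;
    performing upsampling processing on the second convolutional feature data using the upsampling layer to obtain the intermediate upsampled feature data output by the upsampling layer.
  11. The method according to claim 1, characterized in that the image noise reduction model is used in a RAW image noise reduction module, an RGB image noise reduction module, or a YUV image noise reduction module of an ISP chip; correspondingly, the format of the target image is the RAW format, the RGB format, or the YUV format.
  12. The method according to claim 10, characterized in that the upsampling layer performs upsampling processing on the input data of the upsampling layer through convolution processing, unpooling processing, or interpolation processing.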
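  13. An image noise reduction processing apparatus, characterized in that the apparatus comprises:
    a noise reduction module, configured to input target image data into an image noise reduction model to obtain denoised image data output by the image noise reduction model, the target image data comprising the pixel values of each channel of a target image;
    wherein the image noise reduction model comprises a cascaded downsampling model, upsampling model, and output layer, the downsampling model comprises n cascaded downsampling modules, and the upsampling model comprises n cascaded upsampling modules in one-to-one correspondence with the n downsampling modules; each downsampling module comprises a first downsampling module, a second downsampling module, and a fusion module cascaded with both the first downsampling module and the second downsampling module; the first downsampling module comprises a cascaded first downsampling layer and first convolutional layer, and the second downsampling module comprises a second downsampling layer.
  14. The apparatus according to claim 13, characterized in that the noise reduction module is specifically configured to:
    input the target image data into the downsampling model, where the downsampling modules in the downsampling model perform downsampling processing on the target image data to obtain downsampled feature data; input the downsampled feature data into the upsampling model, where the upsampling modules in the upsampling model perform upsampling processing on the downsampled feature data to obtain upsampled feature data; and obtain, by the output layer, the denoised image data based on the upsampled feature data and the target image data.
  15. The apparatus according to claim 14, characterized in that the image data of the channels of the target image have the same resolution, and the noise reduction module is specifically configured to:
    for the i-th downsampling module, perform downsampling processing on the input data of the i-th downsampling module to obtain the intermediate downsampled feature data output by the i-th downsampling module; wherein, when i=1, the input data of the i-th downsampling module is the target image data, and when i is greater than 1, the input data of the i-th downsampling module is the intermediate downsampled feature data output by the (i-1)-th downsampling module; and use the intermediate downsampled feature data output by the last downsampling module as the downsampled feature data.
  16. The apparatus according to claim 15, characterized in that the noise reduction module is specifically configured to:
    for the i-th upsampling module, perform upsampling processing on the input data of the i-th upsampling module to obtain the intermediate upsampled feature data output by the i-th upsampling module; wherein, when i=1, the input data of the i-th upsampling module is the downsampled feature data, and when i is greater than 1, the input data of the i-th upsampling module is aggregated feature data obtained by fusing the intermediate upsampled feature data output by the (i-1)-th upsampling module with the intermediate downsampled feature data output by the downsampling module corresponding to the i-th upsampling module; and use the intermediate upsampled feature data output by the last upsampling module as the upsampled feature data.
  17. The apparatus according to claim 16, characterized in that the noise reduction module is specifically configured to:
    input the upsampled feature data and the target image data into the output layer for fusion processing to obtain the denoised image data output by the output layer.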
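  18. An electronic device, comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method according to any one of claims 1 to 12 when executing the computer program.
  19. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program implements the steps of the method according to any one of claims 1 to 12 when executed by a processor.
  20. A computer program product, comprising a computer program, characterized in that the computer program implements the steps of the method according to any one of claims 1 to 12 when executed by a processor.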
PCT/CN2022/138842 2022-09-16 2022-12-14 图像降噪处理方法、装置、设备、存储介质和程序产品 Ceased WO2024055458A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP22958645.8A EP4535279A4 (en) 2022-09-16 2022-12-14 IMAGE NOISE REDUCTION PROCESSING METHOD AND APPARATUS, DEVICE, STORAGE MEDIUM, AND PROGRAM PRODUCT
JP2024569570A JP7826519B2 (ja) 2022-09-16 2022-12-14 画像ノイズ低減処理方法、装置、デバイス、記憶媒体及びプログラム製品
US18/992,375 US20250390989A1 (en) 2022-09-16 2022-12-14 Image noise reduction processing method and apparatus, device, storage medium, and program product

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211128666.6 2022-09-16
CN202211128666.6A CN115471417B (zh) 2022-09-16 2022-09-16 图像降噪处理方法、装置、设备、存储介质和程序产品

Publications (1)

Publication Number Publication Date
WO2024055458A1 true WO2024055458A1 (zh) 2024-03-21

Family

ID=84333965

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/138842 Ceased WO2024055458A1 (zh) 2022-09-16 2022-12-14 图像降噪处理方法、装置、设备、存储介质和程序产品

Country Status (5)

Country Link
US (1) US20250390989A1 (zh)
EP (1) EP4535279A4 (zh)
JP (1) JP7826519B2 (zh)
CN (1) CN115471417B (zh)
WO (1) WO2024055458A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115471417B (zh) * 2022-09-16 2025-07-15 广州安凯微电子股份有限公司 图像降噪处理方法、装置、设备、存储介质和程序产品
CN116452801B (zh) * 2023-03-14 2026-03-31 苏州国科康成医疗科技有限公司 一种多模态图像分割方法、装置、设备及存储介质

Also Published As

Publication number Publication date
JP7826519B2 (ja) 2026-03-09
US20250390989A1 (en) 2025-12-25
EP4535279A4 (en) 2025-09-24
CN115471417B (zh) 2025-07-15
JP2025517801A (ja) 2025-06-10
CN115471417A (zh) 2022-12-13
EP4535279A1 (en) 2025-04-09

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22958645

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2024569570

Country of ref document: JP

WWE Wipo information: entry into national phase

Ref document number: 2022958645

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 18992375

Country of ref document: US

ENP Entry into the national phase

Ref document number: 2022958645

Country of ref document: EP

Effective date: 20250106

WWE Wipo information: entry into national phase

Ref document number: 202537010973

Country of ref document: IN

WWP Wipo information: published in national office

Ref document number: 202537010973

Country of ref document: IN

WWP Wipo information: published in national office

Ref document number: 2022958645

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE