WO2023124123A1 - Image processing method and related device - Google Patents

Image processing method and related device

Info

Publication number
WO2023124123A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
exposure
network model
style transfer
images
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2022/113424
Other languages
English (en)
French (fr)
Inventor
陈珂
肖斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Honor Device Co Ltd
Original Assignee
Honor Device Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Honor Device Co Ltd
Priority to EP22913408.5A (EP4340383B1)
Priority to US18/574,139 (US20240320794A1)
Publication of WO2023124123A1
Current legal status: Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/04 Context-preserving transformations, e.g. by using an importance map
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/90 Dynamic range modification of images or parts thereof
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4015 Image demosaicing, e.g. colour filter arrays [CFA] or Bayer patterns
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/90 Dynamic range modification of images or parts thereof
    • G06T5/92 Dynamic range modification of images or parts thereof based on global image properties
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2200/00 Indexing scheme for image data processing or generation, in general
    • G06T2200/24 Indexing scheme for image data processing or generation, in general involving graphical user interfaces [GUIs]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10024 Color image
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10141 Special mode during image acquisition
    • G06T2207/10144 Varying exposure
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20212 Image combination
    • G06T2207/20221 Image fusion; Image merging

Definitions

  • the present application relates to the field of image processing, in particular to an image processing method and related equipment.
  • the present application provides an image processing method and related equipment, which can perform color correction on dark areas of images collected from scenes with low illumination or from scenes with high dynamic range, so as to improve quality and thereby improve user experience.
  • an image processing method comprising:
  • displaying a first interface, the first interface including a first control; detecting a first operation on the first control; in response to the first operation, acquiring multiple frames of exposure images, where the exposure times of the multiple frames of exposure images are different, the multiple frames of exposure images include at least one frame of a first long-exposure image, and the exposure time of the first long-exposure image is longer than the exposure time of the other exposure images among the multiple frames of exposure images; and performing style transfer processing on the multiple frames of exposure images to obtain a target image.
  • the embodiment of the present application provides an image processing method in which style transfer processing is performed on multiple frames of exposure images, making it possible to color-correct the dark areas of images collected from low-illumination scenes or high-dynamic-range scenes, which improves image quality and thereby the user experience.
  • before performing style transfer processing on the multiple frames of exposure images to obtain the target image, the method further includes: using a deep learning network model to process the multiple frames of exposure images to obtain a first fused image; and performing first back-end processing on the first fused image to obtain a first back-end image.
  • the deep learning network model can perform at least one of noise reduction and demosaicing, and can also perform multi-exposure fusion and other processing.
  • the embodiment of the present application implements noise reduction, demosaicing, and multi-exposure fusion through a deep learning network model, which avoids both the mutual influence between different processes and the accumulation of errors that arise when multiple processes are performed in series, and thus improves image detail recovery.
  • the method further includes: performing second back-end processing on the first long-exposure image to obtain a second back-end image.
  • the second back-end image corresponding to the first long-exposure image can be subsequently used for style transfer processing. In this case, the amount of data is smaller.
  • the method further includes: acquiring a second long-exposure image, where the exposure time of the second long-exposure image is longer than the exposure time of the exposure images other than the first long-exposure image among the multiple frames of exposure images; and performing the second back-end processing on the second long-exposure image to obtain a second back-end image.
  • an additional long-exposure image with a longer exposure time is acquired, so that the second back-end image corresponding to the additionally acquired second long-exposure image can be subsequently used for style transfer processing.
  • performing style transfer processing on the multiple frames of exposure images to obtain a target image includes: using a target style transfer network model to perform style transfer processing on the first back-end image and the second back-end image to obtain the target image.
  • the second back-end image is determined either from the first long-exposure image or from an additionally collected second long-exposure image; the target style transfer network model is then used to transfer the style of the second back-end image to the first back-end image, so as to color-correct the dark areas of the target image.
  • the method further includes: judging whether the ambient brightness value corresponding to the first back-end image is smaller than a preset ambient brightness value; if not, outputting the first back-end image as the target image; if yes, using the target style transfer network model to perform style transfer processing on the first back-end image and the second back-end image to obtain the target image.
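  • The brightness check described above can be sketched as follows. This is only an illustration of the branching: the names `select_target_image`, `ambient_brightness`, `preset_brightness` and the `style_transfer` callable are hypothetical, and images are represented as plain nested lists rather than real image buffers.

```python
from typing import Callable, List

Image = List[List[float]]  # hypothetical stand-in for an image buffer


def select_target_image(
    ambient_brightness: float,
    preset_brightness: float,
    first_backend: Image,
    second_backend: Image,
    style_transfer: Callable[[Image, Image], Image],
) -> Image:
    """Return the target image according to the ambient-brightness check."""
    if ambient_brightness < preset_brightness:
        # Dark scene: run style transfer on both back-end images.
        return style_transfer(first_backend, second_backend)
    # Bright enough: output the first back-end image unchanged.
    return first_backend
```

  • In this sketch the style transfer model is only invoked for dark scenes, so in brighter scenes its cost is skipped and the first back-end image passes through untouched.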
  • using the target style transfer network model to process the first back-end image and the second back-end image to obtain the target image includes: using the target style transfer network model to process the first back-end image and the second back-end image to obtain a first style transformation matrix; upsampling the first style transformation matrix to obtain a second style transformation matrix; determining a mask image corresponding to the second back-end image; and performing fusion processing on the first back-end image, the second style transformation matrix, and the mask image to obtain the target image.
  • the first style transformation matrix is used to represent the amount of chromaticity deviation between two frames of images input to the target style transfer network model.
  • that is, it indicates the amount of chromaticity deviation between the first back-end image and the second back-end image.
  • because the chromaticity deviation can be determined through the target style transfer network model, the chromaticity deviation between the first back-end image and the second back-end image can be determined by using the target style transfer network model and then, combined with the mask image, the style transfer is performed directly on the first back-end image.
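  • The upsampling and fusion steps above can be illustrated with a small sketch. The source does not specify the upsampling method or the fusion rule, so this example assumes nearest-neighbour upsampling and an additive rule (chroma plus mask times deviation) on a single chroma plane; the function names and all numeric values are hypothetical, and the mask is taken as a given input.

```python
from typing import List

Matrix = List[List[float]]


def upsample_nearest(matrix: Matrix, factor: int) -> Matrix:
    """Enlarge a matrix by repeating every entry `factor` times per axis."""
    out: Matrix = []
    for row in matrix:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def fuse(chroma: Matrix, offsets: Matrix, mask: Matrix) -> Matrix:
    """Add the masked chroma offsets to one chroma plane (assumed fusion rule)."""
    return [
        [c + m * d for c, d, m in zip(c_row, d_row, m_row)]
        for c_row, d_row, m_row in zip(chroma, offsets, mask)
    ]


# A 1x2 low-resolution "first style transformation matrix" (hypothetical values).
first_matrix = [[0.5, -0.25]]
second_matrix = upsample_nearest(first_matrix, 2)  # 2x4 after upsampling
chroma_plane = [[0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0]]
mask = [[1.0, 1.0, 0.0, 0.0], [1.0, 0.0, 0.0, 1.0]]  # 1.0 marks pixels to correct
target = fuse(chroma_plane, second_matrix, mask)
```

  • Under these assumptions, pixels where the mask is 0 keep their original chroma, so the correction only touches the masked (for example, dark) regions.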
  • using the target style transfer network model to process the first back-end image and the second back-end image to obtain a first style transformation matrix includes: using the target style transfer network model to process the first back-end image and the second back-end image to obtain a chromaticity deviation coefficient; and determining the first style transformation matrix according to the first back-end image and the chromaticity deviation coefficient.
  • the chromaticity deviation coefficient is used to represent the chromaticity deviation magnitude between two frames of images input to the target style transfer network model.
  • the chromaticity deviation coefficient is used to represent the magnitude of chromaticity deviation between the first backend image and the second backend image.
  • the chromaticity deviation coefficient, the brightness value of the first back-end image and the chromaticity deviation amount have a functional mapping relationship.
  • because the chromaticity deviation coefficient can be determined through the target style transfer network model, the chromaticity deviation coefficient between the first back-end image and the second back-end image can be determined by using the target style transfer network model; the first style transformation matrix is then determined indirectly from the first back-end image and the chromaticity deviation coefficient, and the mask image is combined to perform style transfer on the first back-end image.
  • the method further includes: converting the first back-end image and the second back-end image from the YUV domain to the RGB domain respectively; using the target style transfer network model to process the first back-end image and the second back-end image then includes: using the target style transfer network model to process the first back-end image and the second back-end image after conversion from the YUV domain to the RGB domain.
  • when the target style transfer network model is trained and generated based on training images located in the RGB domain, the input first back-end image and second back-end image need to be converted into the RGB domain during processing.
  • the method further includes: using multiple pairs of training images to train the initial style transfer network model, and determining the target style transfer network model; wherein, multiple pairs of the training images are located in the RGB domain, and the contents of each pair of training images are the same but the colors corresponding to the dark areas are different.
  • the initial style transfer network model is trained by using training images in the RGB domain, so that the trained target style transfer network model can be used to determine the first style transformation matrix or the chromaticity deviation coefficient between images in the RGB domain.
  • the method further includes: using multiple pairs of training images to train the initial style transfer network model, and determining the target style transfer network model; wherein the multiple pairs of training images are all located in the YUV domain, and the contents of each pair of training images are the same but the colors corresponding to the dark areas are different.
  • the initial style transfer network model is trained by using training images in the YUV domain, so that the trained target style transfer network model can be used to determine the first style transformation matrix or the chromaticity deviation coefficient between images in the YUV domain.
  • using multiple pairs of training images to train the initial style transfer network model and determining the target style transfer network model includes: processing each frame of training image in each pair of training images by using a feature network model to obtain the corresponding feature layer; and training the initial style transfer network model by using the two feature layers corresponding to each pair of training images to obtain the target style transfer network model.
  • feature extraction may be performed on 2 frames of training images of each pair of training images separately.
  • using multiple pairs of training images to train the initial style transfer network model and determining the target style transfer network model includes: splicing the two frames of training images included in each pair of training images to obtain a spliced training image; processing the spliced training image by using the feature network model to obtain the corresponding spliced feature layer; and training the initial style transfer network model by using the spliced feature layer to obtain the target style transfer network model.
  • feature extraction is performed on 2 frames of training images of each pair of training images.
  • the feature extraction network model is any one of a ResNet model, a VGG model, and a MobileNet model.
  • the deep learning network model is any one of a U-Net model, an LLNet model and an FCN model.
  • the first back-end processing includes: converting RGB domain to YUV domain.
  • the first back-end processing further includes: at least one of dynamic range adjustment and tone mapping.
  • the second back-end processing includes: converting the RAW domain to the YUV domain.
  • the target style transfer network model is any one of a ResNet model, a VGG model, a U-Net model, and a V-Net model.
  • in a second aspect, an image processing apparatus is provided, which includes units for performing each step in the above first aspect or any possible implementation manner of the first aspect.
  • an electronic device is provided, including: one or more processors and a memory;
  • the memory is coupled with the one or more processors and is used to store computer program code, the computer program code includes computer instructions, and the one or more processors call the computer instructions to make the electronic device execute the method provided in the first aspect or any possible implementation manner of the first aspect.
  • a chip is provided, including: a processor, configured to call and run a computer program from a memory, so that a device installed with the chip executes the method provided in the first aspect or any possible implementation manner of the first aspect.
  • a computer-readable storage medium is provided, which stores a computer program; the computer program includes program instructions that, when executed, perform the processing steps of the image processing method provided in the first aspect or any possible implementation manner of the first aspect.
  • in a sixth aspect, a computer program product is provided, which includes a computer-readable storage medium storing a computer program, and the computer program enables a computer to execute the processing steps of the image processing method provided in the first aspect or any possible implementation manner of the first aspect.
  • FIG. 1 is an application scenario provided by an embodiment of the present application
  • FIG. 2 is a schematic flow diagram of an image processing method provided in an embodiment of the present application.
  • FIG. 3 shows two exposure methods provided by the embodiment of the present application
  • FIG. 4 is a schematic flowchart of a first back-end processing provided by an embodiment of the present application.
  • FIG. 5 is a schematic flowchart of a second back-end processing provided by an embodiment of the present application.
  • FIG. 6 is a schematic flow diagram of training a target style transfer network model provided by an embodiment of the present application.
  • FIG. 7 is a schematic flow diagram of another way of training a target style transfer network model provided by the embodiment of the present application.
  • FIG. 8 is a schematic flowchart of another image processing method provided by the embodiment of the present application.
  • FIG. 9 is a schematic flowchart of a style transfer process provided by an embodiment of the present application.
  • FIG. 10 is a schematic flow chart of another style transfer process provided by the embodiment of the present application.
  • FIG. 11 is a schematic diagram of an effect provided by the embodiment of the present application.
  • FIG. 12 is a schematic diagram of a hardware system applicable to an electronic device of the present application.
  • FIG. 13 is a schematic diagram of a software system applicable to the electronic device of the present application.
  • FIG. 14 is a schematic structural diagram of an image processing device provided in an embodiment of the present application.
  • FIG. 15 is a schematic structural diagram of a chip provided by the embodiment of the application.
  • "and/or" describes an association between objects and means that three kinds of relationships may exist; for example, A and/or B means: A exists alone, A and B exist simultaneously, or B exists alone.
  • "plural" refers to two or more.
  • "first" and "second" are used for descriptive purposes only, and cannot be understood as indicating or implying relative importance or implicitly specifying the quantity of indicated technical features. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of these features. In the description of this embodiment, unless otherwise specified, "plurality" means two or more.
  • the RGB (red, green, blue) color space refers to a color model related to the structure of the human visual system. According to the structure of the human eye, all colors are seen as different combinations of red, green and blue.
  • the YUV color space refers to a color coding method in which Y represents brightness (luminance or luma), and U and V represent chroma (chrominance or chroma). The RGB color space focuses on the human eye's perception of color, whereas the YUV color space focuses on the sensitivity of vision to brightness.
  • RGB color space and YUV color space can be converted to each other.
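  • As an illustration of this mutual conversion, the following sketch converts a single pixel between RGB and YUV. The source does not name a conversion standard, so the BT.601 luma weights and the analog U/V scale factors used here are an assumption, and the function names are hypothetical.

```python
from typing import Tuple

# Assumed constants (BT.601 luma weights, analog YUV scaling);
# the source does not specify which conversion matrix is used.
KR, KG, KB = 0.299, 0.587, 0.114
U_SCALE, V_SCALE = 0.492, 0.877


def rgb_to_yuv(r: float, g: float, b: float) -> Tuple[float, float, float]:
    """Convert one RGB pixel to (Y, U, V)."""
    y = KR * r + KG * g + KB * b
    return y, U_SCALE * (b - y), V_SCALE * (r - y)


def yuv_to_rgb(y: float, u: float, v: float) -> Tuple[float, float, float]:
    """Invert rgb_to_yuv using the same constants."""
    r = y + v / V_SCALE
    b = y + u / U_SCALE
    g = (y - KR * r - KB * b) / KG
    return r, g, b
```

  • With these constants a neutral gray pixel maps to U and V values of approximately zero, which matches the description of U and V as carrying only chroma.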
  • the pixel value refers to a group of color components corresponding to each pixel in the color image located in the RGB color space.
  • each pixel corresponds to a group of three primary color components, wherein the three primary color components are red component R, green component G and blue component B respectively.
  • a Bayer image, that is, an image output by an image sensor based on a Bayer format color filter array.
  • the pixels of multiple colors in this image are arranged in a Bayer pattern.
  • each pixel in the Bayer format image only corresponds to a channel signal of one color.
  • pixels corresponding to the green channel signal are called green pixels, pixels corresponding to the blue channel signal are called blue pixels, and pixels corresponding to the red channel signal are called red pixels.
  • the minimum repeating unit of the Bayer format image is: one red pixel, two green pixels and one blue pixel arranged in a 2×2 manner.
  • the images arranged in the Bayer format can be considered to be in the RAW domain.
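  • The Bayer arrangement described above can be sketched by sampling one color channel per pixel from an RGB image. The RGGB ordering of the 2×2 unit is an assumption (the source only states one red, two green and one blue pixel), and the names `bayer_color` and `mosaic` are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B)

# Assumed RGGB layout of the 2x2 minimum repeating unit.
PATTERN = (("R", "G"), ("G", "B"))
CHANNEL_INDEX = {"R": 0, "G": 1, "B": 2}


def bayer_color(row: int, col: int) -> str:
    """Return which color channel the sensor records at (row, col)."""
    return PATTERN[row % 2][col % 2]


def mosaic(rgb: List[List[Pixel]]) -> List[List[int]]:
    """Keep a single channel per pixel, producing a RAW-domain style image."""
    return [
        [pixel[CHANNEL_INDEX[bayer_color(r, c)]] for c, pixel in enumerate(row)]
        for r, row in enumerate(rgb)
    ]
```

  • Demosaicing is the inverse problem: estimating the two missing channels at every pixel of such a single-channel image.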
  • Shooting parameters may include shutter, exposure time, aperture value (AV), exposure value (EV) and sensitivity (ISO). Each is introduced below.
  • the shutter is a device that controls the length of time that light enters the camera to determine the exposure time of the image. The longer the shutter is left open, the more light enters the camera and the longer the corresponding exposure time for the image. Conversely, the shorter the shutter remains open, the less light enters the camera and the shorter the exposure time for the image.
  • Exposure time refers to the time for the shutter to be open in order to project light onto the photosensitive surface of the photosensitive material of the camera.
  • the exposure time is determined by the sensitivity of the photosensitive material and the illuminance on the photosensitive surface. The longer the exposure time, the more light enters the camera, and the shorter the exposure time, the less light enters the camera. Therefore, a long exposure time is required in a dark scene, and a short exposure time is required in a backlit scene.
  • the aperture value is the ratio of the focal length of the lens in the camera to the light-passing diameter of the lens. The larger the aperture value, the less light enters the camera. The smaller the aperture value, the more light enters the camera.
  • Exposure value is a value that combines the exposure time and aperture value to represent the light-transmitting ability of the camera lens.
  • the exposure value can be defined as:
  • $EV = \log_2\left(\frac{N^2}{t}\right)$
  • where N is the aperture value and t is the exposure time in seconds.
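  • As a worked example, the sketch below computes an exposure value from N and t, assuming the conventional definition EV = log2(N²/t); the function name `exposure_value` is hypothetical.

```python
import math


def exposure_value(aperture_value: float, exposure_time_s: float) -> float:
    """EV = log2(N^2 / t), with N the aperture value and t in seconds."""
    if aperture_value <= 0 or exposure_time_s <= 0:
        raise ValueError("aperture value and exposure time must be positive")
    return math.log2(aperture_value ** 2 / exposure_time_s)
```

  • For instance, `exposure_value(2.0, 0.25)` evaluates to 4.0, and lengthening the exposure time at a fixed aperture value lowers the EV.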
  • ISO is a measure of how sensitive a film is to light, also known as sensitivity or gain. For insensitive negatives, longer exposure times are required to achieve the same brightness as for sensitive negatives; for sensitive negatives, shorter exposure times are required to achieve the same brightness as for insensitive negatives.
  • the electronic device can implement auto focus (AF), auto exposure (AE) and auto white balance (AWB) through algorithms, so as to adjust these shooting parameters automatically.
  • FIG. 1 shows an application scenario to which this embodiment of the present application is applicable.
  • the embodiment of the present application provides an image processing method in which, on the basis of the first fused image determined by a related image processing method, a frame of long-exposure image with a long exposure time is combined for a second fusion; at the same time, the style corresponding to the long-exposure image is transferred to the first fused image and the dark areas of the first fused image are color-corrected, so as to obtain a target image whose colors are closer to those of the real scene.
  • FIG. 1 is an example of an application scenario, and does not impose any limitation on the application scenario of the present application.
  • the image processing method provided in the embodiment of this application can be applied to, but is not limited to, the following scenarios:
  • FIG. 2 shows a flowchart of an image processing method provided by an embodiment of the present application. The method is applied to electronic equipment.
  • the image processing method provided in the embodiment of the present application may include the following S110 to S150. These steps are described in detail below.
  • the exposure times of the multi-frame exposure images are different, and the multi-frame exposure images include at least one frame of the first long-exposure image.
  • the exposure time of the first long exposure image is longer than that of other exposure images in the multi-frame exposure images.
  • the different exposure times of the multi-frame exposure images means that the exposure times corresponding to at least two frames of the multi-frame exposure images are different, or the exposure times corresponding to each frame of the multi-frame exposure images are different.
  • an exposure image with a relatively long exposure time can be called a long-exposure image, an exposure image with a relatively short exposure time can be called a short-exposure image, and an image whose exposure time lies between those of the long-exposure image and the short-exposure image can be called a normal-exposure image.
  • the long exposure image includes the above-mentioned first long exposure image.
  • the exposure times corresponding to the multiple frames of first long-exposure images may be the same or different, which is not limited in this embodiment of the present application.
  • the electronic device may include one or more image sensors, then the electronic device may control the one or more image sensors to take pictures, so as to obtain multi-frame exposure images.
  • the electronic device can acquire multi-frame exposure images from local storage or from other devices. For example, the user can obtain multi-frame exposure images through a first electronic device D1 and then send them to a second electronic device D2. After receiving the multi-frame exposure images, the second electronic device D2 can execute the image processing method provided in the embodiment of the present application to perform image processing.
  • the electronic device may also acquire multi-frame exposure images in other ways, which is not limited in this embodiment of the present application.
  • the multi-frame exposure image may be an exposure image directly generated by an image sensor, or may be an image obtained after performing one or more processing operations on the exposure image.
  • the multi-frame exposure images include exposure images of 2 or more frames.
  • the multi-frame exposure images are all Bayer format images, that is, all are images in the RAW domain.
  • the multi-frame exposure images may be images taken continuously of the same scene to be shot, where the interval between two adjacent exposures is negligible relative to the exposure time of any frame of exposure image.
  • each exposure image in the multi-frame exposure images corresponds to an exposure start moment and an exposure end moment
  • the duration between the exposure start moment and the exposure end moment is the exposure time corresponding to the exposure image.
  • the exposure start time, exposure end time, and exposure time corresponding to the exposure image may be carried in the exposure image, or may be stored corresponding to the exposure image.
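  • The relationship between exposure moments, exposure time and the first long-exposure image can be sketched as below. The `ExposureFrame` record and the `first_long_exposure` helper are hypothetical; the sketch simply treats the frame with the longest exposure time as the first long-exposure image and, if several frames tie, returns the earliest of them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ExposureFrame:
    name: str
    start_s: float  # exposure start moment, in seconds
    end_s: float    # exposure end moment, in seconds

    @property
    def exposure_time(self) -> float:
        # Duration between the exposure start and end moments.
        return self.end_s - self.start_s


def first_long_exposure(frames: List[ExposureFrame]) -> ExposureFrame:
    """Return the frame with the longest exposure time."""
    if len(frames) < 2:
        raise ValueError("multi-frame exposure images need at least 2 frames")
    return max(frames, key=lambda frame: frame.exposure_time)
```

  • Storing the start and end moments with each frame, as the text allows, means the exposure time never has to be recorded separately.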
  • the embodiment of the present application does not limit the exposure method of the multi-frame exposure images, for example, the exposure time corresponding to each of the multi-frame exposure images increases sequentially according to the exposure order. Or, for example, the exposure time corresponding to each of the multi-frame exposure images decreases sequentially according to the exposure sequence. Among them, the time interval between any two exposures is ignored.
  • FIG. 3 shows two exposure methods provided by the embodiment of the present application.
  • the electronic device performs 6 consecutive exposures and obtains 6 frames of exposure images, namely exposure image P1 to exposure image P6.
  • the exposure time corresponding to the exposure image P1 is T1
  • the exposure time corresponding to the exposure image P2 is T2, T2>T1
  • the exposure time corresponding to the exposure image P3 is T3, T3>T2
  • the exposure time corresponding to the exposure image P5 is T5
  • the exposure time corresponding to the exposure image P6 is T6, and T6>T5.
  • the exposure images of the first to fourth frames can be called normal-exposure images; the exposure times of the fifth and sixth exposure images are longer than those of the normal-exposure images, so the fifth and sixth exposure images can be called long-exposure images.
  • the electronic device performs 6 consecutive exposures and obtains 6 frames of exposure images, namely exposure image Q1 to exposure image Q6.
  • the exposure time corresponding to the exposure image Q1 is T21
  • the exposure time corresponding to the exposure image Q2 is T22, T21>T22
  • the exposure time corresponding to the exposure image Q3 is T23, T22>T23
  • the exposure time corresponding to the exposure image Q5 is T25
  • the exposure time corresponding to the exposure image Q6 is T26, and T25>T26.
  • the exposure images of the second and third frames can be called normal-exposure images; the exposure time of the first exposure image is longer than that of the normal-exposure images, so the first exposure image can be called a long-exposure image; the exposure times of the fourth to sixth exposure images are shorter than those of the normal-exposure images, so the fourth to sixth exposure images can be called short-exposure images.
  • the first fused image is located in the RGB color space, that is, located in the RGB domain.
  • each pixel in the first fused image in the RGB domain includes three color components, that is, each pixel includes a red component, a green component and a blue component.
  • the size of the first fused image may be the same as that of any one frame of exposure images in the multiple frames of exposure images.
  • the deep learning network model can perform at least one of noise reduction and demosaicing, and can also perform multi-exposure fusion and other processing.
  • demosaicing and noise reduction are both operations related to detail restoration: performing demosaicing first affects the result of noise reduction, and performing noise reduction first affects the result of demosaicing. Therefore, in the embodiment of the present application, both noise reduction and demosaicing are implemented through a single deep learning network model, which avoids the interaction between different processes and the accumulation of errors that occur when multiple processes are performed in series, and improves the restoration of image detail.
  • multi-exposure fusion refers to the fusion of multiple frames of images with different exposure times.
  • the deep learning network model can be any one of a U-Net model, an LLNet model, and an FCN model.
  • the deep learning network model can also be other models, which can be selected according to needs, and this embodiment of the present application does not impose any limitation on this.
  • the first back-end processing includes converting RGB domain to YUV domain.
  • converting the first fused image from the RGB domain to the YUV domain means converting the first fused image located in the RGB domain into an image located in the YUV domain; that is, the resulting first back-end image is located in the YUV domain.
  • the data volume of the first backend image in the YUV domain is relatively small, and can better reflect the brightness, color and saturation information of the scene.
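  • A minimal sketch of the RGB-to-YUV conversion step is given below. The source does not name a conversion standard, so the BT.601 analog YUV coefficients used here are an assumption, as are the names `RGB_TO_YUV` and `rgb_to_yuv`.

```python
import numpy as np

# BT.601 analog YUV coefficients. This choice is an assumption:
# the source does not say which conversion matrix is used.
RGB_TO_YUV = np.array([
    [0.299, 0.587, 0.114],
    [-0.14713, -0.28886, 0.436],
    [0.615, -0.51499, -0.10001],
])


def rgb_to_yuv(rgb: np.ndarray) -> np.ndarray:
    """Convert an H x W x 3 float RGB image with values in [0, 1] to YUV."""
    return rgb @ RGB_TO_YUV.T


# A uniform gray image should have luminance 0.5 and (almost) zero chroma.
gray = np.full((2, 2, 3), 0.5)
yuv = rgb_to_yuv(gray)
```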
  • the first back-end processing may further include at least one of dynamic range control (DRC) and tone mapping.
  • dynamic range adjustment is used to provide compression and amplification capabilities.
  • the dynamic range of the current image may be mapped to a larger dynamic range, so that the brightness corresponding to the pixels in the bright area in the image is brighter, and the brightness corresponding to the pixels in the dark area is darker.
  • Tone mapping refers to the mapping and transformation of the image color.
  • the gray scale of the image can be adjusted through tone mapping, so that the processed image looks more comfortable to the human eye and better expresses the information and features of the original image.
  • the obtained first fused image is an image in the RGB domain (that is, an RGB image), but its colors only meet the display requirements of electronic devices and do not meet the viewing requirements of human vision; the first fused image can be considered a linear RGB image. Therefore, dynamic range adjustment, tone mapping, and other processing also need to be performed on the first fused image to turn it into a non-linear RGB image that is better suited to viewing by human eyes.
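  • As an illustration of turning a linear RGB image into a non-linear one, the sketch below applies a simple gamma curve. The source does not specify the tone-mapping curve, so the gamma value of 2.2 and the function name `tone_map_gamma` are assumptions chosen only for the example.

```python
import numpy as np


def tone_map_gamma(linear: np.ndarray, gamma: float = 2.2) -> np.ndarray:
    """Map linear values in [0, 1] to non-linear values with a gamma curve."""
    clipped = np.clip(linear, 0.0, 1.0)
    return clipped ** (1.0 / gamma)


# Dark values are lifted the most, while 0 and 1 stay fixed.
linear = np.array([0.0, 0.05, 0.5, 1.0])
mapped = tone_map_gamma(linear)
```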
  • FIG. 4 shows a schematic flowchart of a first back-end processing provided by an embodiment of the present application.
  • the first back-end processing includes dynamic range adjustment, tone mapping, and conversion from RGB domain to YUV domain in sequence.
  • first back-end processing may also include other steps, and the order of the steps included in the first back-end processing may be changed as required. There are no restrictions here.
  • performing the second back-end processing on the first long-exposure image is to multiplex the first long-exposure image among the acquired multi-frame exposure images, without additional acquisition, thereby reducing the amount of collected data.
  • the second back-end processing may be performed on one or more frames of the first long-exposure images in the multi-frame exposure images to obtain the second back-end images.
  • the first long-exposure image is located in the RAW domain
  • the second back-end image is located in the YUV domain.
  • the second back-end processing includes converting the RAW domain to the YUV domain.
  • the second back-end processing may also include other steps, and the order of the steps included in the second back-end processing may be changed as required. There are no restrictions here.
  • style transfer process refers to correcting the color of the first back-end image by using the color deviation between the first back-end image and the second back-end image, so as to improve the quality of the target image.
  • the first back-end image is obtained from a multi-frame exposure image through a series of processing such as noise reduction.
  • the corresponding noise is already very small and the definition is very high, but there will still be color cast in dark areas, which is quite different from the real scene.
  • the second back-end image is obtained from the first long-exposure image through the second back-end processing.
  • the first long-exposure image has a longer exposure time, and the color is more in line with the real scene.
  • the color of the corresponding second back-end image is also more in line with the real scene.
  • the color of the first long-exposure image can be transferred to the first back-end image while retaining the low noise and high definition of the first back-end image. Therefore, by using the target style transfer network model to perform style transfer processing on the first back-end image and the second back-end image, a higher-quality target image can be obtained.
  • the style transfer network model needs to be trained. Therefore, the above method may also include the following S160.
  • the initial style transfer network model can be any one of a ResNet model, a VGG model, a U-Net model, and a V-Net model.
  • the determined target style transfer network model is the same as the original model corresponding to the initial style transfer network model.
  • multiple pairs of training images in the RGB domain can be used to train the initial style transfer network model to determine the target style transfer network model.
  • each pair of training images includes a first training image and a second training image, and both the first training image and the second training image are located in the RGB domain.
  • the first training image and the second training image are images shot for the same scene to be shot, that is, the first training image and the second training image include the same content.
  • the dark area refers to the area in the first training image and the second training image whose luminance value is smaller than a preset luminance value.
  • the color of the dark area in the first training image is color cast, but the color of the dark area in the second training image is normal.
  • first input the first training image in each pair of training images into the first feature extraction network model, and determine the first feature layer (feature map) corresponding to the first training image.
  • the second training image is input into the second feature extraction network model, and the second feature layer corresponding to the second training image is determined.
  • after the first style transformation matrix or chromaticity deviation coefficient is obtained, it can be applied to the first training image; it is then determined whether the color of the image obtained by performing style transfer on the first training image based on the first style transformation matrix or chromaticity deviation coefficient is consistent with, or relatively close to, that of the second training image. Alternatively, it may be determined whether the difference between the pixel values of that image and the second training image is less than a preset difference threshold (for example, 0.008). If the colors of the two frames of images are judged to be relatively close, or the difference between the two frames is less than the preset difference threshold, the initial style transfer network model can be considered trained, and the trained initial style transfer network model serves as the target style transfer network model.
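  • The threshold-based stopping check can be sketched as below. The source only says that the difference between pixel values is compared with a preset threshold such as 0.008; measuring that difference as a mean absolute difference over values normalized to [0, 1] is an assumption, and `is_converged` is a hypothetical name.

```python
import numpy as np


def is_converged(styled: np.ndarray, reference: np.ndarray, threshold: float = 0.008) -> bool:
    """Return True when the mean absolute pixel difference is below the threshold."""
    diff = np.mean(np.abs(styled.astype(np.float64) - reference.astype(np.float64)))
    return bool(diff < threshold)


# Hypothetical images: the second training image and two style-transferred candidates.
reference = np.full((4, 4, 3), 0.5)
close_result = is_converged(reference + 0.001, reference)
far_result = is_converged(reference + 0.05, reference)
```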
  • the feature layer is used to represent an abstract feature extracted from an image, for example, the abstract feature may be the depth of a color.
  • the first style transformation matrix is used to represent the amount of chromaticity deviation between two frames of input images.
  • the first style transformation matrix includes a plurality of chromaticity deviations arranged in multiple rows and multiple columns, and each chromaticity deviation corresponds to the difference between the chromaticity at the same position of two frames of input images.
  • the chromaticity deviation coefficient is used to represent the chromaticity deviation magnitude between two frames of input images, wherein, the larger the chromaticity deviation coefficient, the larger the chromaticity deviation, and the smaller the chromaticity deviation coefficient, the smaller the chromaticity deviation.
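  • To make the two quantities concrete, the sketch below computes a per-position chromaticity difference between two YUV images and a single magnitude for it. In the source these values are produced by the style transfer network model, not computed directly; the direct subtraction, the sign convention (second minus first), the use of a mean absolute value as the magnitude, and the function names are all assumptions for illustration.

```python
import numpy as np


def chroma_deviation(first_yuv: np.ndarray, second_yuv: np.ndarray) -> np.ndarray:
    """Per-position difference of the U and V channels (second minus first)."""
    return second_yuv[..., 1:] - first_yuv[..., 1:]


def deviation_magnitude(deviation: np.ndarray) -> float:
    """Summarize a deviation map as one number: larger means a larger color shift."""
    return float(np.mean(np.abs(deviation)))


first = np.zeros((2, 2, 3))
second = np.zeros((2, 2, 3))
second[..., 1] = 0.1   # U channel shifted
second[..., 2] = -0.2  # V channel shifted
deviation = chroma_deviation(first, second)
magnitude = deviation_magnitude(deviation)
```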
  • both the first feature extraction network model and the second feature extraction network model can be any one of a ResNet model, a VGG model, and a MobileNet model.
  • first feature extraction network model and the second feature extraction network model may be the same or different, and this embodiment of the present application does not impose any limitation on this.
  • the trained target style transfer network model can be used to determine the first style transformation matrix or chromaticity deviation coefficient between images in the RGB domain.
  • training images may be collected according to needs, which is not limited in this embodiment of the present application.
  • multiple pairs of training images in the YUV domain may be used to train the initial style transfer network model to determine the target style transfer network model.
  • the process of training the initial style transfer network model is similar to the process corresponding to the first embodiment above; refer to the above description, which will not be repeated here.
  • the trained target style transfer network model can be used to determine the first style transformation matrix or chromaticity deviation coefficient between images in the YUV domain.
  • multiple pairs of training images located in the RGB domain may be used to train the initial style transfer network model to determine the target style transfer network model.
  • the first training image and the second training image in each pair of training images are first spliced to obtain a spliced training image; then, the spliced training image is processed using a feature network model to obtain a corresponding spliced feature layer; finally, the initial style transfer network model is used to process the spliced feature layer to obtain the corresponding first style transformation matrix or chromaticity deviation coefficient.
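  • The splicing step can be illustrated as follows. The source does not say along which axis the two training images are spliced; concatenating along the channel axis, which turns two H x W x 3 images into one H x W x 6 array, is an assumption, and `splice_pair` is a hypothetical name.

```python
import numpy as np


def splice_pair(first_image: np.ndarray, second_image: np.ndarray) -> np.ndarray:
    """Concatenate two same-sized images along the channel axis (assumed splice axis)."""
    if first_image.shape != second_image.shape:
        raise ValueError("both training images must have the same shape")
    return np.concatenate([first_image, second_image], axis=-1)


first_image = np.zeros((8, 8, 3))
second_image = np.ones((8, 8, 3))
spliced = splice_pair(first_image, second_image)
```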
  • multiple pairs of training images in the YUV domain may be used to train the initial style transfer network model to determine the target style transfer network model.
  • the training process is similar to the process corresponding to the third embodiment, and reference may be made to the above description, which will not be repeated here.
  • the multi-frame exposure images are processed by using the deep learning network model to obtain the first fused image, and the first back-end processing is then performed on the first fused image to obtain the first back-end image; next, a long-exposure image that has a longer exposure time and is closer to the real scene color is reused, and the second back-end processing is performed on this long-exposure image to obtain a second back-end image.
  • the target style transfer network model is then used to perform a second fusion of the first back-end image and the second back-end image corresponding to the long-exposure image; during this second fusion, the style of the second back-end image is transferred to the first back-end image and the dark area of the first back-end image is color corrected, so as to obtain a target image close to the real scene color.
  • FIG. 8 shows a schematic flowchart of another image processing method provided by the embodiment of the present application. The method is applied to electronic equipment.
  • the image processing method provided by the embodiment of the present application may include S210 to S260, and these steps will be described in detail below.
  • the exposure times of the multi-frame exposure images are different, and the multi-frame exposure images include at least one frame of the first long-exposure image.
  • the exposure time of the first long exposure image is longer than that of other exposure images in the multi-frame exposure images.
  • the exposure time of the second long-exposure image is longer than the exposure time of other exposure images in the multi-frame exposure images except the first long-exposure image.
  • the exposure time of the second long exposure image may be greater than or equal to the exposure time of the first long exposure image.
  • one or more frames of second long-exposure images may be acquired here.
  • the exposure times of multiple frames of second long-exposure images may be the same or different.
  • the electronic device may include one or more image sensors, then the electronic device may control the one or more image sensors to shoot, so as to obtain a multi-frame exposure image and a second long exposure image.
  • the electronic device may acquire the multi-frame exposure image and the second long exposure image from local storage or from other devices.
  • the electronic device can control the one or more image sensors to take pictures, so as to obtain a multi-frame exposure image, and obtain a second long exposure image through local storage or from other devices.
  • the electronic device may also acquire the second long exposure image in other ways, which is not limited in this embodiment of the present application.
  • the second long exposure image may be an exposure image directly generated by an image sensor, or may be an image obtained after performing one or more processing operations on the exposure image.
  • the second long-exposure image is a Bayer format image, that is, an image in the RAW domain.
  • the second long-exposure image and the multi-frame exposure images are images taken continuously of the same scene to be shot; the order in which the second long-exposure image and the multi-frame exposure images are shot can be set as required, and this embodiment of the present application imposes no restriction on it.
  • the method may further include the following S251 to S253.
  • the light sensor may be used to perceive and store the ambient brightness value of the surrounding environment when the first back-end image is acquired. For example, if the light sensor determines that the ambient brightness value when the first back-end image is acquired is 120, which is greater than the preset ambient brightness value of 100, it can be determined that the ambient light is good, that the color of the obtained first back-end image can be guaranteed, and that there will be no large color deviation; style transfer processing is therefore not required, and the first back-end image is directly output as the target image.
  • if the ambient brightness value of the surrounding environment is 60, which is much smaller than the preset ambient brightness value of 100, it can be determined that the ambient light is very poor, that the color of the obtained first back-end image cannot be guaranteed, and that there will be a large color deviation in the dark area. To solve the problem of color deviation in the dark area, the target style transfer network model then needs to be used to perform style transfer processing on the first back-end image and the second back-end image, so as to obtain a target image that is close to the real scene color and has no color deviation in the dark area.
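  • The brightness-based branching in the two examples above can be sketched as a simple gate. The preset value of 100 comes from the examples; how a brightness exactly equal to the preset is handled is not stated in the source, so treating it as requiring style transfer is an assumption, and `needs_style_transfer` is a hypothetical name.

```python
def needs_style_transfer(ambient_brightness: float, preset_brightness: float = 100.0) -> bool:
    """Return True when the scene is too dark to trust the first back-end image's color."""
    # Brighter than the preset: output the first back-end image directly.
    # Otherwise (assumed to include equality): run style transfer.
    return not ambient_brightness > preset_brightness


bright_scene = needs_style_transfer(120.0)  # first example in the text
dark_scene = needs_style_transfer(60.0)     # second example in the text
```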
  • the multi-frame exposure images are processed by using the deep learning network model to obtain the first fused image, and the first back-end processing is then performed on the first fused image to obtain the first back-end image; next, a long-exposure image that has a longer exposure time and is closer to the real scene color is additionally collected, and the second back-end processing is performed on this long-exposure image to obtain a second back-end image.
  • the target style transfer network model is then used to perform a second fusion of the first back-end image and the second back-end image corresponding to the long-exposure image; during this second fusion, the style of the second back-end image is transferred to the first back-end image and the dark area of the first back-end image is color corrected, so as to obtain a target image close to the real scene color.
  • the first back-end image can also be screened according to its ambient brightness value.
  • if the first back-end image meets the requirements, the color deviation of the image is not serious, and the first back-end image can be output directly without further processing; the above series of processing is performed only on a first back-end image that does not meet the requirements, so as to correct the color of the dark area.
  • FIG. 9 shows a schematic flowchart of performing style transfer processing on the first back-end image and the second back-end image.
  • the process may include S310 to S350.
  • S310 Perform conversion from YUV domain to RGB domain on the first back-end image and the second back-end image respectively, to obtain a first intermediate image corresponding to the first back-end image and a second intermediate image corresponding to the second back-end image.
  • both the first back-end image and the second back-end image are located in the YUV domain, and after being converted from the YUV domain to the RGB domain, the obtained first intermediate image and the second intermediate image are both located in the RGB domain.
  • the target style transfer network model is generated by training with training images in the RGB domain, so that the target style transfer network model can process the first intermediate image and the second intermediate image in the RGB domain.
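  • The YUV-to-RGB conversion in S310 can be sketched as the inverse of an RGB-to-YUV matrix. The source does not name the conversion standard; the BT.601 analog YUV coefficients below are an assumption, as are the names `RGB_TO_YUV`, `YUV_TO_RGB`, and `yuv_to_rgb`.

```python
import numpy as np

# BT.601 analog YUV coefficients (an assumption; the source names no standard).
RGB_TO_YUV = np.array([
    [0.299, 0.587, 0.114],
    [-0.14713, -0.28886, 0.436],
    [0.615, -0.51499, -0.10001],
])
# Deriving the inverse numerically keeps the two directions consistent.
YUV_TO_RGB = np.linalg.inv(RGB_TO_YUV)


def yuv_to_rgb(yuv: np.ndarray) -> np.ndarray:
    """Convert an H x W x 3 YUV image back to RGB."""
    return yuv @ YUV_TO_RGB.T


# Round trip on one hypothetical pixel: RGB -> YUV -> RGB.
rgb_pixel = np.array([[[0.2, 0.4, 0.6]]])
restored = yuv_to_rgb(rgb_pixel @ RGB_TO_YUV.T)
```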
  • upsampling refers to enlarging an image, here, the original smaller-sized first style transformation matrix is enlarged into a larger-sized second style transformation matrix.
  • the size of the enlarged second style transformation matrix is the same as that of the first intermediate image and the second intermediate image; since the first back-end image and the second back-end image have the same size as the first intermediate image and the second intermediate image, the size of the second style transformation matrix is also the same as that of the first back-end image and the second back-end image.
  • the second style transformation matrix includes chromaticity deviations corresponding to pixels in multiple rows and columns.
  • the target style transfer network model will perform down-sampling in order to extract the first style transformation matrix, so that the size of the first style transformation matrix will be reduced relative to the first intermediate image and the second intermediate image.
  • the size of the first intermediate image and the second intermediate image is 512×512×3
  • the size of the obtained first style transformation matrix is 16×16×6. Therefore, in S330, the first style transformation matrix needs to be upsampled to increase its size, yielding a second style transformation matrix with the same size as the first back-end image, so that the second style transformation matrix can subsequently be used to process the first back-end image.
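  • The size change in S330 can be sketched with the example sizes above: a 16×16×6 matrix is enlarged by a factor of 32 in height and width to reach 512×512. The source does not state the interpolation method, so nearest-neighbor repetition is an assumption, and `upsample_nearest` is a hypothetical name.

```python
import numpy as np


def upsample_nearest(matrix: np.ndarray, factor: int) -> np.ndarray:
    """Enlarge an H x W x C array by repeating each entry `factor` times per axis."""
    rows = np.repeat(matrix, factor, axis=0)
    return np.repeat(rows, factor, axis=1)


# Hypothetical first style transformation matrix of size 16 x 16 x 6.
first_matrix = np.arange(16 * 16 * 6, dtype=np.float32).reshape(16, 16, 6)
second_matrix = upsample_nearest(first_matrix, factor=512 // 16)
```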
  • when the second intermediate image is obtained by processing the first long-exposure image, determining the mask image corresponding to the second intermediate image is equivalent to determining the mask image corresponding to the first long-exposure image; when the second intermediate image is obtained by processing the second long-exposure image, determining the mask image corresponding to the second intermediate image is equivalent to determining the mask image corresponding to the second long-exposure image.
  • the mask image is used to mask the dark area in the first long-exposure image or the second long-exposure image, so that the bright area and the dark area in the image can be processed separately during subsequent style transfer.
  • the mask image is a binary image.
  • the mask image can be generated according to the brightness corresponding to each pixel.
  • the pixels in the second intermediate image are divided into bright areas and dark areas according to the brightness.
  • the corresponding value is 0, that is, it is white in the mask image; and
  • the corresponding value is 1, that is, it appears black in the mask image.
  • Suv(i,j) represents the chromaticity value at the pixel position in row i, column j of the first back-end image
  • Luv(i,j) represents the chromaticity deviation value at row i, column j of the second style transformation matrix
  • N(i,j) represents the value (0 or 1) at the pixel position in row i, column j of the mask image
  • Muv(i,j) represents the target chromaticity value at the pixel position in row i, column j of the target image.
  • a is used to represent the first weight assigned to the first back-end image
  • b is used to represent the second weight assigned to the second style transformation matrix.
  • the above formula only processes the chromaticity value of the image to obtain the target chromaticity value corresponding to the target image, and the luminance value corresponding to the target image can be determined according to the luminance value at the same position of the first back-end image.
  • the brightness value corresponding to the target image may be equal to the brightness value at the same position of the first back-end image.
  • the luminance value and target chrominance value corresponding to each pixel in the target image can be obtained, and the target image thus obtained is an image in the YUV domain.
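  • Assembling the YUV target image from its two parts can be sketched as below: the luminance channel is copied from the first back-end image, and the chrominance channels are the target chromaticity values. How the target chromaticity values are computed is left out here, since the formula is not reproduced in this text; the array layout (Y in channel 0, U and V in channels 1 and 2) and the name `assemble_target` are assumptions.

```python
import numpy as np


def assemble_target(first_backend_yuv: np.ndarray, target_uv: np.ndarray) -> np.ndarray:
    """Combine the first back-end image's luminance with the target chromaticity."""
    luminance = first_backend_yuv[..., :1]  # keep a trailing channel axis
    return np.concatenate([luminance, target_uv], axis=-1)


# Hypothetical inputs: a first back-end image and already computed target chroma.
first_backend = np.zeros((4, 4, 3))
first_backend[..., 0] = 0.7
target_uv = np.full((4, 4, 2), 0.1)
target = assemble_target(first_backend, target_uv)
```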
  • the target style transfer network model can directly process the first back-end image and the second back-end image in the YUV domain to obtain the first style transformation matrix.
  • a mask image of the second backend image is determined. Perform fusion processing on the first back-end image, the second style transformation matrix and the mask image to obtain the target image.
  • FIG. 10 shows another schematic flowchart of performing style transfer processing on the first back-end image and the second back-end image.
  • the process may include S410 to S460.
  • S410 Perform conversion from YUV domain to RGB domain on the first back-end image and the second back-end image respectively, to obtain a first intermediate image corresponding to the first back-end image and a second intermediate image corresponding to the second back-end image.
  • both the first back-end image and the second back-end image are located in the YUV domain, and the first intermediate image and the second intermediate image obtained after transcoding are both located in the RGB domain.
  • the target style transfer network model is generated by training with training images in the RGB domain, so that the target style transfer network model can process the first intermediate image and the second intermediate image in the RGB domain.
  • the chromaticity deviation coefficient is used to represent the corresponding relationship between brightness and chromaticity deviation values.
  • different chromaticity deviation coefficients can be obtained corresponding to different regions in the image, or different chromaticity deviation coefficients can be obtained corresponding to different pixel positions in the image.
  • Y(i,j) represents the luminance value of the first intermediate image at the pixel position in row i, column j; k(i,j) represents the chromaticity deviation coefficient at row i, column j; L'uv(i,j) represents the chromaticity deviation value at row i, column j of the first style transformation matrix; and f represents the functional mapping between Y(i,j)*k(i,j) and L'uv(i,j).
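  • Reading the symbol definitions above as L'uv(i,j) = f(Y(i,j)*k(i,j)), the computation can be sketched as follows. The source does not define f, so the identity function is used purely as a placeholder; the function name `deviation_from_coefficient` and the sample values are hypothetical.

```python
from typing import Callable

import numpy as np


def deviation_from_coefficient(
    luminance: np.ndarray,
    coefficient: np.ndarray,
    mapping: Callable[[np.ndarray], np.ndarray] = lambda x: x,
) -> np.ndarray:
    """Apply the mapping f to the element-wise product Y(i,j) * k(i,j)."""
    return mapping(luminance * coefficient)


# Hypothetical 1 x 2 luminance map and chromaticity deviation coefficients.
luminance = np.array([[0.2, 0.4]])
coefficient = np.array([[0.5, 0.5]])
deviation = deviation_from_coefficient(luminance, coefficient)
```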
  • the description of S440 to S460 is the same as that of S330 to S350 above, and will not be repeated here.
  • the target style transfer network model can directly process the first back-end image and the second back-end image in the YUV domain to obtain the chromaticity deviation coefficient; the first style transformation matrix is then determined according to the first back-end image and the chromaticity deviation coefficient. Next, the first style transformation matrix is upsampled to obtain the second style transformation matrix, and a mask image of the second back-end image is determined. Finally, fusion processing is performed on the first back-end image, the second style transformation matrix, and the mask image to obtain the target image.
  • FIG. 11 is a schematic diagram of an effect provided by the embodiment of the present application.
  • a first fused image as shown in (a) in FIG. 11 may be obtained.
  • the dark area in the first fused image is relatively large, and the sky, the ground, and other content belonging to the dark area in the first fused image have a color cast. Therefore, the user experience is very poor.
  • Fig. 12 shows a schematic structural diagram of an electronic device applicable to this application.
  • the electronic device 100 may be used to implement the methods described in the foregoing method embodiments.
  • the electronic device 100 may be a mobile phone, a smart screen, a tablet computer, a wearable electronic device, a vehicle-mounted electronic device, an augmented reality (AR) device, a virtual reality (VR) device, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), a projector, etc.
  • the electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone jack 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, and the like.
  • the sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, bone conduction sensor 180M, etc.
  • Processor 110 may include one or more processing units.
  • the processor 110 may include at least one of the following processing units: an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and a neural-network processing unit (NPU).
  • the controller can generate an operation control signal according to the instruction opcode and timing signal, and complete the control of fetching and executing the instruction.
  • a memory may also be provided in the processor 110 for storing instructions and data.
  • the memory in processor 110 is a cache memory.
  • the memory may hold instructions or data that the processor 110 has just used or uses repeatedly. If the processor 110 needs to use the instructions or data again, it can fetch them directly from this memory. This avoids repeated access and reduces the waiting time of the processor 110, thereby improving the efficiency of the system.
  • the connection relationship between the modules shown in FIG. 12 is only a schematic illustration, and does not constitute a limitation on the connection relationship between the modules of the electronic device 100.
  • each module of the electronic device 100 may also adopt a combination of various connection modes in the foregoing embodiments.
  • the wireless communication function of the electronic device 100 can be realized by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, a modem processor, a baseband processor, and the like.
  • the electronic device 100 realizes the display function through the GPU, the display screen 194, and the application processor.
  • the GPU is a microprocessor for image processing, and is connected to the display screen 194 and the application processor. GPUs are used to perform mathematical and geometric calculations for graphics rendering.
  • Processor 110 may include one or more GPUs that execute program instructions to generate or alter display information.
  • the display screen 194 is used to display images, videos and the like.
  • Camera 193 is used to capture images or videos. It can be triggered by an application command to realize the camera function, such as capturing images of any scene.
  • a camera may include components such as an imaging lens, an optical filter, and an image sensor. The light emitted or reflected by the object enters the imaging lens, passes through the filter, and finally converges on the image sensor.
  • the imaging lens is mainly used for converging and imaging the light emitted or reflected by all objects in the camera's field of view (also called the scene to be shot, the target scene, or the scene image that the user expects to shoot); the filter is mainly used to filter out redundant light waves (for example, light waves other than visible light, such as infrared); the image sensor is mainly used to perform photoelectric conversion on the received light signal, convert it into an electrical signal, and input it into the processor 110 for subsequent processing.
  • the camera 193 may be located at the front of the electronic device 100, or at the back of the electronic device 100, and the specific number and arrangement of the cameras may be set according to requirements, which are not limited in this application.
  • the internal memory 121 may be used to store computer-executable program codes including instructions.
  • the internal memory 121 may include an area for storing programs and an area for storing data.
  • the stored program area can store an operating system, at least one application program required by a function (such as a sound playing function, an image playing function, etc.) and the like.
  • the storage data area can store data created during the use of the electronic device 100 (such as audio data, phonebook, etc.) and the like.
  • the internal memory 121 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, flash memory device, universal flash storage (universal flash storage, UFS) and the like.
  • the processor 110 executes various functional applications and data processing of the electronic device 100 by executing instructions stored in the internal memory 121 and/or instructions stored in a memory provided in the processor.
  • the internal memory 121 can also store the software code of the image processing method provided by the embodiment of the present application.
  • when the processor 110 runs the software code, it executes the process steps of the image processing method to obtain an image with higher definition.
  • the internal memory 121 can also store captured images.
  • the external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, so as to expand the storage capacity of the electronic device 100.
  • the external memory card communicates with the processor 110 through the external memory interface 120 to implement a data storage function, for example saving files such as music to the external memory card.
  • the software code of the image processing method provided in the embodiment of the present application can also be stored in an external memory, and the processor 110 can run the software code through the external memory interface 120 to execute the process steps of the image processing method to obtain a high-definition image.
  • images captured by the electronic device 100 may also be stored in an external memory.
  • the user can designate whether to store the image in the internal memory 121 or the external memory.
  • when the electronic device 100 is currently connected to the external memory, if the electronic device 100 captures one frame of image, a prompt message may pop up asking the user whether to store the image in the external memory or the internal memory; of course, other ways of designating the storage location are possible, and the embodiment of the present application does not impose any limitation on this; alternatively, when the electronic device 100 detects that the remaining capacity of the internal memory 121 is less than a preset amount, it may automatically store the image in the external memory.
  • the electronic device 100 can implement audio functions through the audio module 170 , the speaker 170A, the receiver 170B, the microphone 170C, the earphone interface 170D, and the application processor. Such as music playback, recording, etc.
  • the camera 193 can capture multi-frame exposure images, and the processor 110 performs image processing on the multi-frame exposure images; the image processing can include noise reduction, dynamic range adjustment, tone mapping, transcoding, upsampling, fusion processing, etc., and through the image processing a target image with better color effect can be obtained.
  • the processor 110 may control the display screen 194 to present the processed target image, where the target image is an image captured in a scene with low illumination.
  • the hardware system of the electronic device 100 is described in detail above, and the software system of the electronic device 100 is introduced below.
  • the software system may adopt a layered architecture, an event-driven architecture, a micro-kernel architecture, a micro-service architecture, or a cloud architecture.
  • the embodiment of the present application uses a layered architecture as an example to exemplarily describe the software system of the electronic device 100 .
  • a software system adopting a layered architecture is divided into several layers, and each layer has a clear role and division of labor. Layers communicate through software interfaces.
  • the software system can be divided into five layers, which are application layer 210 , application framework layer 220 , hardware abstraction layer 230 , driver layer 240 and hardware layer 250 from top to bottom.
  • the application layer 210 may include a camera, a gallery, and may also include applications such as calendar, call, map, navigation, WLAN, Bluetooth, music, video, and short message.
  • the application framework layer 220 provides application program access interfaces and programming frameworks for the applications of the application layer 210 .
  • the application framework layer includes a camera access interface, and the camera access interface is used to provide camera shooting services through camera management and camera equipment.
  • the camera management in the application framework layer 220 is used to manage cameras. Camera management can obtain camera parameters, such as judging the working status of the camera.
  • the camera device in the application framework layer 220 is used to provide a data access interface between different camera devices and camera management.
  • the hardware abstraction layer 230 is used to abstract hardware.
  • the hardware abstraction layer can include the camera hardware abstraction layer and other hardware device abstraction layers; the camera hardware abstraction layer can include camera device 1, camera device 2, etc.; the camera hardware abstraction layer can be connected with the camera algorithm library, and the camera hardware abstraction layer Algorithms in the camera algorithm library can be called.
  • the driver layer 240 is used to provide drivers for different hardware devices.
  • the driver layer may include camera drivers; digital signal processor drivers and graphics processor drivers.
  • the hardware layer 250 may include sensors, image signal processors, digital signal processors, graphics processors, and other hardware devices.
  • the sensor may include sensor 1, sensor 2, etc., and may also include a depth sensor (time of flight, TOF) and a multispectral sensor, etc., and there is no limitation in this embodiment of the present application.
  • when the user performs a single-click operation on the touch sensor 180K, the camera APP is awakened by the single-click operation and calls each camera device of the camera hardware abstraction layer through the camera access interface.
  • the camera hardware abstraction layer can send an instruction to call a certain camera to the camera device driver, and at the same time, the camera algorithm library starts to load the deep learning network model and the target style transfer network model used in the embodiment of the present application.
  • when a sensor at the hardware layer is called, for example when sensor 1 in a certain camera is called to obtain multiple frames of exposure images with different exposure times, the multiple frames of exposure images are returned to the hardware abstraction layer, and the deep learning network model in the loaded camera algorithm library performs noise reduction, multi-exposure fusion and other processing to generate the first fusion image; the image signal processor is then called to perform dynamic range adjustment, tone mapping, RGB-domain-to-YUV-domain conversion and other processing on the first fusion image, and at the same time the image signal processor is called to convert the reused first long-exposure image or the additionally acquired second long-exposure image from the RGB domain to the YUV domain; the target style transfer network model in the loaded camera algorithm library then performs style transfer processing to obtain the target image.
  • the obtained target image is sent back to the camera application through the camera hardware abstraction layer and the camera access interface for display and storage.
  • the device embodiment of the present application will be described in detail below with reference to FIG. 14 . It should be understood that the devices in the embodiments of the present application can execute the various methods in the foregoing embodiments of the present application, that is, the specific working processes of the following various products can refer to the corresponding processes in the foregoing method embodiments.
  • FIG. 14 is a schematic structural diagram of an image processing apparatus 300 provided by an embodiment of the present application.
  • the image processing device 300 includes an acquisition module 310 and a processing module 320 .
  • the acquisition module 310 is used to acquire multi-frame exposure images; the exposure times of the multi-frame exposure images are different, the multi-frame exposure images include at least one frame of a first long-exposure image, and the exposure time of the first long-exposure image is longer than the exposure times of the other exposure images in the multi-frame exposure images.
  • the processing module 320 is used to perform style transfer processing on multi-frame exposure images to obtain target images.
  • processing module 320 is also used to:
  • the multi-frame exposure images are processed to obtain the first fusion image
  • processing module 320 is also used to:
  • the second back-end processing is performed on the first long-exposure image to obtain a second back-end image.
  • processing module 320 is also used to:
  • the exposure time of the second long exposure image is longer than the exposure time of other exposure images except the first long exposure image in the multi-frame exposure image;
  • the second back-end processing is performed on the second long-exposure image to obtain a second back-end image.
  • processing module 320 is also used to:
  • processing module 320 is also used to:
  • processing module 320 is also used to:
  • processing module 320 is also used to:
  • processing module 320 is also used to:
  • the first back-end image and the second back-end image are respectively converted from the YUV domain to the RGB domain;
  • the first back-end image and the second back-end image are processed using the target style transfer network model, including:
  • the first back-end image and the second back-end image after the YUV domain to RGB domain are processed are processed.
  • processing module 320 is also used to:
  • processing module 320 is also used to:
  • processing module 320 is also used to:
  • a feature network model is used for processing to obtain the corresponding feature layer
  • processing module 320 is also used to:
  • the initial style transfer network model is trained by splicing feature layers to obtain the target style transfer network model.
  • the deep learning network model is any one of Unet model, LLnet model and FCN model.
  • the first back-end processing includes: converting RGB domain to YUV domain.
  • the first back-end processing further includes: at least one of dynamic range adjustment and tone mapping.
  • the second back-end processing includes: converting RAW domain to YUV domain.
  • the target style transfer network model is any one of Resnet model, vgg model, unet model and vnet model.
  • module may be implemented in the form of software and/or hardware, which is not specifically limited.
  • a “module” may be a software program, a hardware circuit or a combination of both to realize the above functions.
  • the hardware circuitry may include application specific integrated circuits (ASICs), electronic circuits, processors (such as shared processors, dedicated processors, or group processors) and memory for executing one or more software or firmware programs, combinational logic circuits, and/or other suitable components that support the described functionality.
  • the units of each example described in the embodiments of the present application can be realized by electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each specific application, but such implementation should not be regarded as exceeding the scope of the present application.
  • the embodiment of the present application also provides a computer-readable storage medium, the computer-readable storage medium stores computer instructions; when the computer instructions are run on the image processing apparatus 300, the image processing apparatus 300 executes the aforementioned image processing method.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired (such as coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (such as infrared, radio, microwave, etc.) means.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or may be a data storage device including one or more servers, data centers, etc. that can be integrated with the medium.
  • the available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium, or a semiconductor medium (for example, a solid state disk (solid state disk, SSD)) and the like.
  • the embodiment of the present application also provides a computer program product including computer instructions, which, when run on the image processing apparatus 300 , enables the image processing apparatus 300 to execute the aforementioned image processing method.
  • FIG. 15 is a schematic structural diagram of a chip provided by an embodiment of the present application.
  • the chip shown in FIG. 15 may be a general-purpose processor or a special-purpose processor.
  • the chip includes a processor 401 .
  • the processor 401 is used to support the image processing apparatus 300 to execute the technical solution shown above.
  • the chip further includes a transceiver 402, and the transceiver 402 is used to accept the control of the processor 401, and is used to support the image processing apparatus 300 to execute the aforementioned technical solutions.
  • the chip shown in FIG. 15 may further include: a storage medium 403 .
  • the chip shown in FIG. 15 can be implemented using the following circuits or devices: one or more field programmable gate arrays (FPGAs), programmable logic devices (PLDs), controllers, state machines, gate logic, discrete hardware components, any other suitable circuitry, or any combination of circuitry capable of performing the various functions described throughout this application.
  • the electronic device, image processing apparatus 300, computer storage medium, computer program product, and chip provided by the above-mentioned embodiments of the present application are all used to execute the method provided above; therefore, for the beneficial effects they can achieve, refer to the beneficial effects of the corresponding method provided above, which are not repeated here.
  • sequence numbers of the above processes do not mean the order of execution, and the execution order of each process should be determined by its functions and internal logic, and should not constitute any limitation on the implementation process of the embodiment of the present application.
  • presetting and predefining can be realized by pre-saving corresponding codes, tables or other methods that can be used to indicate related information in devices (for example, including electronic devices) , the present application does not limit its specific implementation.

Abstract

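The present application provides an image processing method and a related device, relating to the field of image processing. The method includes: displaying a first interface, the first interface including a first control; detecting a first operation on the first control; in response to the first operation, acquiring multi-frame exposure images; and performing style transfer processing on the multi-frame exposure images to obtain a target image. The image processing method provided in the present application can effectively improve the color of an image obtained by fusing multiple frames captured in a low-illumination scene or in a high dynamic range scene.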

Description

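Image processing method and related device
This application claims priority to Chinese patent application No. 202111677018.1, entitled "Image processing method and related device" and filed with the China National Intellectual Property Administration on December 31, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of image processing, and in particular to an image processing method and a related device.
Background
With the widespread use of electronic devices, taking photos with an electronic device has become part of people's daily life. Taking a mobile phone as an example of an electronic device, various algorithms have emerged to improve image quality.
However, in some dimly lit scenes, such as night scenes, the illumination in the scene is low, so the signal is relatively weak when the mobile phone captures an image, and the colors of the generated image deviate; the related art cannot effectively solve this. Therefore, how to correct the color of images captured in low-illumination scenes has become an urgent problem.
Summary
The present application provides an image processing method and a related device that can perform color correction on the dark regions of images captured in low-illumination scenes or in high dynamic range scenes, improving image quality and thereby the user experience.
To achieve the above objective, the present application adopts the following technical solutions: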
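In a first aspect, an image processing method is provided, the method including:
displaying a first interface, the first interface including a first control; detecting a first operation on the first control; in response to the first operation, acquiring multi-frame exposure images, where the exposure times of the multi-frame exposure images are different, the multi-frame exposure images include at least one frame of a first long-exposure image, and the exposure time of the first long-exposure image is longer than the exposure times of the other exposure images in the multi-frame exposure images; and performing style transfer processing on the multi-frame exposure images to obtain a target image.
An embodiment of the present application provides an image processing method that performs style transfer processing on multi-frame exposure images, so that color correction can be applied to the dark regions of images captured in low-illumination scenes or in high dynamic range scenes, improving image quality and thereby the user experience.
In a possible implementation of the first aspect, before performing style transfer processing on the multi-frame exposure images to obtain the target image, the method further includes: processing the multi-frame exposure images using a deep learning network model to obtain a first fusion image; and performing first back-end processing on the first fusion image to obtain a first back-end image.
Optionally, the deep learning network model may perform at least one of noise reduction and demosaicing, and may also perform processing such as multi-exposure fusion.
In this implementation, noise reduction, demosaicing, and multi-exposure fusion are all implemented by a single deep learning network model. This avoids the mutual interference between processing steps, and the resulting accumulation of errors, that occurs when multiple processing steps are performed serially, and it improves the recovery of image detail.
In a possible implementation of the first aspect, the method further includes: performing second back-end processing on the first long-exposure image to obtain a second back-end image.
In this implementation, the first long-exposure image in the multi-frame exposure images is reused, so that the second back-end image corresponding to the first long-exposure image can subsequently be used for style transfer processing. In this case, the amount of data is smaller.
In a possible implementation of the first aspect, the method further includes: acquiring a second long-exposure image, where the exposure time of the second long-exposure image is longer than the exposure times of the exposure images other than the first long-exposure image in the multi-frame exposure images; and performing second back-end processing on the second long-exposure image to obtain a second back-end image.
In this implementation, one additional long-exposure frame with a relatively long exposure time is captured, so that the second back-end image corresponding to this additionally captured second long-exposure image can subsequently be used for style transfer processing.
In a possible implementation of the first aspect, performing style transfer processing on the multi-frame exposure images to obtain the target image includes: performing style transfer processing on the first back-end image and the second back-end image using a target style transfer network model to obtain the target image.
In this implementation, the second back-end image is determined either from the reused first long-exposure image or from the additionally captured second long-exposure image; the target style transfer network model then transfers the style of the second back-end image onto the first back-end image, which corrects the color of the dark regions of the target image.
In a possible implementation of the first aspect, the method further includes: determining whether the ambient brightness value corresponding to the first back-end image is less than a preset ambient brightness value; if not, outputting the first back-end image as the target image; if so, performing style transfer processing on the first back-end image and the second back-end image using the target style transfer network model to obtain the target image.
In this implementation, checking the ambient brightness value corresponding to the first back-end image makes it possible to screen out images whose color deviation is not severe and to skip style transfer processing for them, which reduces the amount of computation and improves processing efficiency.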
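In a possible implementation of the first aspect, processing the first back-end image and the second back-end image using the target style transfer network model to obtain the target image includes: processing the first back-end image and the second back-end image using the target style transfer network model to obtain a first style transformation matrix; upsampling the first style transformation matrix to obtain a second style transformation matrix; determining a mask image corresponding to the second back-end image; and fusing the first back-end image, the second style transformation matrix, and the mask image to obtain the target image.
The first style transformation matrix represents the chroma deviation amount between the two frames of images input to the target style transfer network model. Here, it represents the chroma deviation amount between the first back-end image and the second back-end image.
In this implementation, the target style transfer network model can determine the chroma deviation amount. The model is therefore used to determine the chroma deviation amount between the first back-end image and the second back-end image, which, combined with the mask image, is applied directly to the first back-end image to perform style transfer.
In a possible implementation of the first aspect, processing the first back-end image and the second back-end image using the target style transfer network model to obtain the first style transformation matrix includes: processing the first back-end image and the second back-end image using the target style transfer network model to obtain a chroma deviation coefficient; and determining the first style transformation matrix according to the first back-end image and the chroma deviation coefficient.
The chroma deviation coefficient represents the magnitude of the chroma deviation between the two frames of images input to the target style transfer network model. Here, it represents the magnitude of the chroma deviation between the first back-end image and the second back-end image. The chroma deviation coefficient, the luminance value of the first back-end image, and the chroma deviation amount are related by a functional mapping.
In this implementation, the target style transfer network model can determine the chroma deviation coefficient. The model is therefore used to determine the chroma deviation coefficient between the first back-end image and the second back-end image; the first style transformation matrix is then determined indirectly from the first back-end image and the chroma deviation coefficient, and, combined with the mask image, is used to perform style transfer on the first back-end image.
In a possible implementation of the first aspect, the method further includes: converting the first back-end image and the second back-end image from the YUV domain to the RGB domain respectively; and processing the first back-end image and the second back-end image using the target style transfer network model includes: using the target style transfer network model to process the first back-end image and the second back-end image after the YUV-domain-to-RGB-domain conversion.
In this implementation, if the target style transfer network model was trained on training images in the RGB domain, the input first back-end image and second back-end image need to be converted to the RGB domain before processing.
In a possible implementation of the first aspect, the method further includes: training an initial style transfer network model using multiple pairs of training images to determine the target style transfer network model, where the multiple pairs of training images are all in the RGB domain, and the two images of each pair have the same content but different colors in the dark regions.
In this implementation, the initial style transfer network model is trained with training images in the RGB domain, so the trained target style transfer network model can be used to determine the first style transformation matrix or the chroma deviation coefficient between images in the RGB domain.
In a possible implementation of the first aspect, the method further includes: training an initial style transfer network model using multiple pairs of training images to determine the target style transfer network model, where the multiple pairs of training images are all in the YUV domain, and the two images of each pair have the same content but different colors in the dark regions.
In this implementation, the initial style transfer network model is trained with training images in the YUV domain, so the trained target style transfer network model can be used to determine the first style transformation matrix or the chroma deviation coefficient between images in the YUV domain.
In a possible implementation of the first aspect, training the initial style transfer network model using multiple pairs of training images to determine the target style transfer network model includes: for each frame of training image in each pair of training images, processing the frame with one feature network model to obtain a corresponding feature layer; and training the initial style transfer network model with the two feature layers corresponding to each pair of training images to obtain the target style transfer network model.
In this implementation, features can be extracted separately from the two frames of training images in each pair.
In a possible implementation of the first aspect, training the initial style transfer network model using multiple pairs of training images to determine the target style transfer network model includes: splicing the two frames of training images in each pair to obtain a spliced training image; processing the spliced training image with a feature network model to obtain a corresponding spliced feature layer; and training the initial style transfer network model with the spliced feature layer to obtain the target style transfer network model.
In this implementation, the two frames of training images in each pair can be spliced first, and features can then be extracted from them jointly.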
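In a possible implementation of the first aspect, the feature extraction network model is any one of a resnet model, a vgg model, and a mobilenet model.
In a possible implementation of the first aspect, the deep learning network model is any one of a Unet model, an LLnet model, and an FCN model.
In a possible implementation of the first aspect, the first back-end processing includes: RGB-domain-to-YUV-domain conversion.
In a possible implementation of the first aspect, the first back-end processing further includes: at least one of dynamic range adjustment and tone mapping.
In a possible implementation of the first aspect, the second back-end processing includes: RAW-domain-to-YUV-domain conversion.
In a possible implementation of the first aspect, the target style transfer network model is any one of a Resnet model, a vgg model, a unet model, and a vnet model.
In a second aspect, an image processing apparatus is provided, the apparatus including units for performing each step in the first aspect or in any possible implementation of the first aspect.
In a third aspect, an electronic device is provided, including: one or more processors and a memory;
the memory is coupled to the one or more processors and is used to store computer program code, the computer program code includes computer instructions, and the one or more processors invoke the computer instructions to cause the electronic device to perform the processing steps of the image processing method provided in the first aspect or in any possible implementation of the first aspect.
In a fourth aspect, a chip is provided, including: a processor, configured to call and run a computer program from a memory, so that a device on which the chip is installed performs the processing steps of the image processing method provided in the first aspect or in any possible implementation of the first aspect.
In a fifth aspect, a computer-readable storage medium is provided; the computer-readable storage medium stores a computer program, the computer program includes program instructions, and the program instructions, when executed by a processor, cause the processor to perform the processing steps of the image processing method provided in the first aspect or in any possible implementation of the first aspect.
In a sixth aspect, a computer program product is provided; the computer program product includes a computer-readable storage medium storing a computer program, and the computer program causes a computer to perform the processing steps of the image processing method provided in the first aspect or in any possible implementation of the first aspect.
For the beneficial effects of the second to sixth aspects, refer to the beneficial effects of the first aspect above; they are not repeated here.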
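Brief Description of the Drawings
FIG. 1 shows an application scenario provided by an embodiment of the present application;
FIG. 2 is a schematic flowchart of an image processing method provided by an embodiment of the present application;
FIG. 3 shows two exposure modes provided by an embodiment of the present application;
FIG. 4 is a schematic flowchart of first back-end processing provided by an embodiment of the present application;
FIG. 5 is a schematic flowchart of second back-end processing provided by an embodiment of the present application;
FIG. 6 is a schematic flowchart of training a target style transfer network model provided by an embodiment of the present application;
FIG. 7 is a schematic flowchart of another way of training a target style transfer network model provided by an embodiment of the present application;
FIG. 8 is a schematic flowchart of another image processing method provided by an embodiment of the present application;
FIG. 9 is a schematic flowchart of style transfer processing provided by an embodiment of the present application;
FIG. 10 is a schematic flowchart of another style transfer processing provided by an embodiment of the present application;
FIG. 11 is a schematic diagram of an effect provided by an embodiment of the present application;
FIG. 12 is a schematic diagram of a hardware system of an electronic device applicable to the present application;
FIG. 13 is a schematic diagram of a software system of an electronic device applicable to the present application;
FIG. 14 is a schematic structural diagram of an image processing apparatus provided by an embodiment of the present application;
FIG. 15 is a schematic structural diagram of a chip provided by an embodiment of the present application.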
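Detailed Description
The technical solutions in the present application are described below with reference to the accompanying drawings.
In the description of the embodiments of the present application, unless otherwise specified, "/" means "or"; for example, A/B can mean A or B. "And/or" herein merely describes an association between associated objects and indicates that three relationships are possible; for example, A and/or B can mean: A alone, both A and B, or B alone. In addition, in the description of the embodiments of the present application, "multiple" means two or more.
In the following, the terms "first" and "second" are used for descriptive purposes only and should not be understood as indicating or implying relative importance or implicitly specifying the number of the indicated technical features. A feature qualified by "first" or "second" may therefore explicitly or implicitly include one or more such features. In the description of this embodiment, unless otherwise specified, "multiple" means two or more.
First, some terms used in the embodiments of the present application are explained to help those skilled in the art understand them.
1. The RGB (red, green, blue) color space, or RGB domain, is a color model related to the structure of the human visual system. Based on the structure of the human eye, every color is treated as a different combination of red, green, and blue.
2. The YUV color space is a color encoding method in which Y denotes luminance (luma) and U and V denote chrominance (chroma). The RGB color space emphasizes how the human eye perceives color, whereas the YUV color space emphasizes the sensitivity of vision to luminance; the RGB color space and the YUV color space can be converted into each other.
3. A pixel value is the set of color components corresponding to each pixel in a color image in the RGB color space. For example, each pixel corresponds to a set of three primary color components: a red component R, a green component G, and a blue component B.
4. A Bayer-format image (bayer image) is an image output by an image sensor based on a Bayer-format color filter array. The pixels of the various colors in the image are arranged in the Bayer format, and each pixel in a Bayer-format image corresponds to a channel signal of only one color. For example, because human vision is more sensitive to green, green pixels (pixels corresponding to the green channel signal) can be set to account for 50% of all pixels, while blue pixels (pixels corresponding to the blue channel signal) and red pixels (pixels corresponding to the red channel signal) each account for 25% of all pixels. The minimum repeating unit of a Bayer-format image is one red pixel, two green pixels, and one blue pixel arranged in a 2×2 pattern. An image arranged in the Bayer format can be regarded as being in the RAW domain.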
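5. Shooting parameters. The shooting parameters may include shutter, exposure time, aperture value (AV), exposure value (EV), and sensitivity (ISO). Each is described below.
The shutter is the device that controls how long light enters the camera and thereby determines the exposure time of the image. The longer the shutter stays open, the more light enters the camera and the longer the exposure time of the image. Conversely, the shorter the shutter stays open, the less light enters the camera and the shorter the exposure time of the image.
The exposure time is the time for which the shutter must be open in order to project light onto the photosensitive surface of the camera's photosensitive material. The exposure time is determined by the sensitivity of the photosensitive material and the illuminance on the photosensitive surface. The longer the exposure time, the more light enters the camera; the shorter the exposure time, the less light enters the camera. Therefore, a long exposure time is needed in low-light scenes and a short exposure time is needed in backlit scenes.
The aperture value is the ratio of the focal length of the lens in the camera to the clear aperture diameter of the lens. Because it is inversely proportional to the aperture diameter, the larger the aperture value, the less light enters the camera; the smaller the aperture value, the more light enters the camera.
The exposure value is a number that combines the exposure time and the aperture value to express the light-gathering capability of the camera lens. The exposure value can be defined as: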
$$EV = \log_2 \frac{N^2}{t}$$
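where N is the aperture value and t is the exposure time in seconds.
ISO measures how sensitive the film is to light, that is, the sensitivity or gain. An insensitive film needs a longer exposure time to produce an image as bright as that of a sensitive film, and a sensitive film needs a shorter exposure time to produce an image as bright as that of an insensitive film.
Among the shooting parameters (shutter, exposure time, aperture value, exposure value, and ISO), the electronic device can use algorithms to implement at least one of auto focus (AF), automatic exposure (AE), and auto white balance (AWB), so as to adjust these shooting parameters automatically.
The above is a brief introduction to the terms used in the embodiments of the present application; they are not explained again below.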
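As a worked illustration of the exposure value introduced in the glossary, the following minimal sketch assumes the standard photographic definition EV = log2(N² / t), with N the aperture value and t the exposure time in seconds. The function name `exposure_value` is hypothetical and is not part of the application.

```python
import math


def exposure_value(aperture_n: float, exposure_time_s: float) -> float:
    """Exposure value under the assumed definition EV = log2(N^2 / t)."""
    if aperture_n <= 0 or exposure_time_s <= 0:
        raise ValueError("aperture value and exposure time must be positive")
    return math.log2(aperture_n ** 2 / exposure_time_s)


# Same aperture value, exposure time shortened from 1 s to 1/4 s.
ev_long = exposure_value(2.0, 1.0)
ev_short = exposure_value(2.0, 0.25)
```

Under this definition, halving the exposure time at a fixed aperture value raises EV by exactly one, so the two calls above differ by two.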
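With the widespread use of electronic devices, taking photos with an electronic device has become part of people's daily life. Taking a mobile phone as an example of an electronic device, various algorithms have emerged to improve image quality.
However, in some dimly lit night scenes or high dynamic range (HDR) scenes, the dark regions of the captured image exhibit a color cast, and the related art cannot effectively solve this problem.
For example, FIG. 1 shows an application scenario to which the embodiments of the present application are applicable.
As shown in (a) of FIG. 1, apart from the area lit by the street lamp, the illumination in the rest of the scene is relatively low. When a mobile phone photographs this scene, the area lit by the street lamp can be called the bright region and the other areas can be called the dark region. As shown in (b) of FIG. 1, because the illumination of the dark region is lower than that of the bright region, the signal corresponding to the dark region is weaker than the signal corresponding to the bright region when the mobile phone captures the image, and the colors of the dark region in the generated image deviate more than those of the bright region. If there is other interference at this time, the dark region is also affected by it more severely than the bright region, so the color deviation of the dark region becomes even larger.
With reference to FIG. 1, suppose a purple ball is placed in both the bright region and the dark region. When the mobile phone captures the image, the signal corresponding to the bright region is relatively strong and the color is normal, that is, the ball in the bright region still appears purple in the captured image; the signal corresponding to the dark region is relatively weak and the color deviates, for example the ball that is actually purple appears blue-purple in the image, with a bluish tone.
Therefore, how to correct the color of images captured in low-illumination scenes, or in scenes with low local illumination, has become an urgent problem.
In view of this, an embodiment of the present application provides an image processing method. Starting from the first fusion image determined by a related image processing method, the method combines it with one long-exposure frame that has a relatively long exposure time; while performing a second fusion, it transfers the style of the long-exposure image onto the first fusion image and corrects the color of the dark regions of the first fusion image, thereby obtaining a target image whose colors are closer to those of the real scene.
It should be understood that the scenario shown in FIG. 1 is only an example of an application scenario and does not limit the application scenarios of the present application in any way. The image processing method provided in the embodiments of the present application can be applied to, but is not limited to, the following scenarios:
image capture, video recording, video calls, video conferencing applications, long- and short-video applications, live video streaming applications, online video course applications, intelligent camera-movement scenarios, video recording with the system camera's recording function, video surveillance, smart door viewers, and other shooting scenarios.
The image processing method provided in the embodiments of the present application is described in detail below with reference to the accompanying drawings.
FIG. 2 shows a flowchart of an image processing method provided by an embodiment of the present application. The method is applied to an electronic device.
As shown in FIG. 2, the image processing method provided by the embodiment of the present application may include the following S110 to S150. These steps are described in detail below.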
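S110. Acquire multi-frame exposure images.
The exposure times of the multi-frame exposure images are different, and the multi-frame exposure images include at least one frame of a first long-exposure image. The exposure time of the first long-exposure image is longer than the exposure times of the other exposure images in the multi-frame exposure images.
It should be understood that "the exposure times of the multi-frame exposure images are different" means that at least two of the exposure images have different exposure times, or that every exposure image has a different exposure time.
An exposure image with a relatively long exposure time can be called a long-exposure image, an exposure image with a relatively short exposure time can be called a short-exposure image, and an image whose exposure time lies between that of a long-exposure image and that of a short-exposure image can be called a normal-exposure image. It can be understood that "long exposure", "short exposure", and "normal exposure" are relative concepts; the exposure times corresponding to long-exposure, normal-exposure, and short-exposure images can be divided and modified as needed, and the embodiments of the present application do not impose any limitation on this. The long-exposure images include the first long-exposure image described above.
When the multi-frame exposure images include multiple frames of first long-exposure images, the exposure times of these first long-exposure images may be the same or different; the embodiments of the present application do not impose any limitation on this.
In some embodiments, the electronic device may include one or more image sensors, in which case the electronic device can control the one or more image sensors to shoot and thereby obtain the multi-frame exposure images. In other embodiments, regardless of whether the electronic device includes an image sensor, the electronic device can obtain the multi-frame exposure images from local storage or from another device. For example, a user can capture the multi-frame exposure images with a first electronic device D1 and then send them to a second electronic device D2; after receiving the multi-frame exposure images, the second electronic device D2 can execute the image processing method provided by the embodiments of the present application to process them. Of course, in practice the electronic device can also obtain the multi-frame exposure images in other ways, and the embodiments of the present application do not impose any limitation on this.
It should be understood that the multi-frame exposure images may be exposure images generated directly by the image sensor, or images obtained after one or more processing operations are performed on those exposure images.
It should be understood that the multi-frame exposure images include two or more frames of exposure images. The multi-frame exposure images are all Bayer-format images, that is, they are all images in the RAW domain.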
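To make the Bayer (RAW-domain) layout of these exposure images concrete, the sketch below samples one color channel per pixel from a full RGB image. It is only an illustration: the RGGB ordering inside the 2×2 repeating unit and the helper names `bayer_color` and `mosaic` are assumptions, since the text only fixes the proportions (one red, two green, one blue per unit).

```python
from collections import Counter


def bayer_color(row: int, col: int) -> str:
    """Color sampled at (row, col), assuming an RGGB 2x2 repeating unit."""
    if row % 2 == 0:
        return "R" if col % 2 == 0 else "G"
    return "G" if col % 2 == 0 else "B"


def mosaic(rgb_image):
    """Keep a single channel per pixel; rgb_image is a list of rows of (r, g, b)."""
    channel = {"R": 0, "G": 1, "B": 2}
    return [
        [pixel[channel[bayer_color(r, c)]] for c, pixel in enumerate(row)]
        for r, row in enumerate(rgb_image)
    ]


# Share of each color over a 4x4 patch: green 50%, red and blue 25% each.
counts = Counter(bayer_color(r, c) for r in range(4) for c in range(4))
shares = {color: n / 16 for color, n in counts.items()}
```

Demosaicing is the inverse step, which the deep learning network model in S120 may perform together with noise reduction.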
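It should be understood that the multi-frame exposure images may be images captured consecutively of the same scene to be shot, where the interval between the exposures of two adjacent exposure images is negligible compared with the exposure time of any one exposure image.
It should be understood that each exposure image in the multi-frame exposure images corresponds to an exposure start time and an exposure end time, and the duration from the exposure start time to the exposure end time is the exposure time of that exposure image. The exposure start time, exposure end time, and exposure time of an exposure image may be carried in the exposure image itself or stored in association with it.
The embodiments of the present application do not limit the exposure mode of the multi-frame exposure images. For example, the exposure times of the multi-frame exposure images may increase successively in the order of exposure, or they may decrease successively in the order of exposure. The time interval between any two exposures is negligible.
For example, FIG. 3 shows two exposure modes provided by an embodiment of the present application. As shown in (a) of FIG. 3, the electronic device exposes six times in succession and obtains six exposure images, namely exposure images P1 to P6. The exposure time of exposure image P1 is T1 and the exposure time of exposure image P2 is T2, with T2>T1; the exposure time of exposure image P3 is T3, with T3>T2; and so on, the exposure time of exposure image P5 is T5 and the exposure time of exposure image P6 is T6, with T6>T5.
The first exposure image P1 to the fourth exposure image can be called normal-exposure images; the exposure times of the fifth and sixth exposure images are longer than those of the normal-exposure images, so the fifth and sixth exposure images can be called long-exposure images.
As shown in (b) of FIG. 3, the electronic device exposes six times in succession and obtains six exposure images, namely exposure images Q1 to Q6. The exposure time of exposure image Q1 is T21 and the exposure time of exposure image Q2 is T22, with T21>T22; the exposure time of exposure image Q3 is T23, with T22>T23; and so on, the exposure time of exposure image Q5 is T25 and the exposure time of exposure image Q6 is T26, with T25>T26.
The second exposure image Q2 and the third exposure image can be called normal-exposure images; the exposure time of the first exposure image is longer than that of the normal-exposure images, so the first exposure image can be called a long-exposure image; the exposure times of the fourth to sixth exposure images are shorter than that of the normal-exposure images, so the fourth to sixth exposure images can be called short-exposure images.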
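S120. Process the multi-frame exposure images using a deep learning network model to obtain a first fusion image corresponding to the multi-frame exposure images.
The first fusion image is in the RGB color space, that is, in the RGB domain.
It should be understood that each pixel of the first fusion image in the RGB domain includes three color components: a red component, a green component, and a blue component. The size of the first fusion image may be the same as the size of any one of the multi-frame exposure images.
Optionally, the deep learning network model may perform at least one of noise reduction and demosaicing, and may also perform processing such as multi-exposure fusion.
It should be understood that when an image sensor is used to acquire the multi-frame exposure images, the low illumination of the external environment and the performance of the image sensor itself cause the generated exposure images to contain a large amount of noise. This noise blurs the exposure images as a whole and loses a lot of detail, so noise reduction is needed to reduce its effect.
It should be understood that demosaicing and noise reduction are both operations related to detail recovery; demosaicing first affects the noise reduction result, and noise reduction first affects the demosaicing result. Therefore, in the embodiments of the present application both noise reduction and demosaicing are implemented by a single deep learning network model. This avoids the mutual interference between processing steps, and the resulting accumulation of errors, that occurs when multiple processing steps are performed serially, and it improves the recovery of image detail.
It should be understood that multi-exposure fusion refers to fusing multiple frames of images with different exposure times.
Optionally, the deep learning network model may be any one of a Unet model, an LLnet model, and an FCN model.
Of course, the deep learning network model may also be another model, which can be selected as needed; the embodiments of the present application do not impose any limitation on this.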
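S130. Perform first back-end processing on the first fusion image to obtain a first back-end image corresponding to the first fusion image.
Optionally, the first back-end processing includes RGB-domain-to-YUV-domain conversion.
RGB-domain-to-YUV-domain conversion of the first fusion image means converting the first fusion image in the RGB domain into an image in the YUV domain; that is, the first back-end image is then in the YUV domain.
It should be understood that the first back-end image in the YUV domain has a relatively small amount of data and better reflects the luminance, color, and saturation information of the scene.
Optionally, the first back-end processing may further include at least one of dynamic range adjustment (dynamic range control, DRC) and tone mapping.
It should be understood that dynamic range adjustment provides compression and expansion capabilities. For example, the dynamic range of the current image can be mapped onto a larger dynamic range, so that pixels in bright regions become brighter and pixels in dark regions become darker.
Tone mapping refers to applying a mapping transformation to the image colors. For example, tone mapping can adjust the gray levels of the image so that the processed image looks more comfortable to the human eye, and the tone-mapped image can better express the information and features of the original image.
After the multi-frame exposure images are processed by the deep learning network model, the resulting first fusion image is an image in the RGB domain (that is, an RGB image), but its colors only satisfy the display requirements of the electronic device and do not meet the viewing requirements of human vision; the first fusion image can be regarded as a linear RGB image. Therefore, processing such as dynamic range adjustment and tone mapping is also needed to turn the first fusion image into a non-linear RGB image that is better suited to human viewing.
FIG. 4 shows a schematic flowchart of first back-end processing provided by an embodiment of the present application.
As shown in FIG. 4, the first back-end processing includes, in order, dynamic range adjustment, tone mapping, and RGB-domain-to-YUV-domain conversion.
Of course, the above is only one example of first back-end processing; the first back-end processing may include other steps, and the order of its steps can be changed as needed. The embodiments of the present application do not impose any limitation on this.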
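The order shown in FIG. 4 (dynamic range adjustment, then tone mapping, then RGB-domain-to-YUV-domain conversion) can be sketched per pixel as follows. Everything concrete here is an assumption for illustration only: the contrast-stretch form of the dynamic range adjustment, the gamma curve used as tone mapping, and the BT.601-style conversion coefficients are stand-ins, not the operators used in the application.

```python
def clip01(value: float) -> float:
    """Clamp a normalized channel value to [0, 1]."""
    return min(1.0, max(0.0, value))


def adjust_dynamic_range(value: float, gain: float = 1.2) -> float:
    """Assumed stand-in: stretch around mid-gray so bright gets brighter, dark darker."""
    return clip01(0.5 + (value - 0.5) * gain)


def tone_map(value: float, gamma: float = 2.2) -> float:
    """Assumed stand-in: gamma curve turning linear RGB into non-linear RGB."""
    return value ** (1.0 / gamma)


def rgb_to_yuv(r: float, g: float, b: float):
    """RGB to YUV with BT.601-style coefficients (an assumed choice)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.14713 * r - 0.28886 * g + 0.436 * b
    v = 0.615 * r - 0.51499 * g - 0.10001 * b
    return y, u, v


def first_backend(pixel):
    """Apply the three steps in the order of FIG. 4 to one linear RGB pixel."""
    r, g, b = (tone_map(adjust_dynamic_range(c)) for c in pixel)
    return rgb_to_yuv(r, g, b)


y, u, v = first_backend((0.5, 0.5, 0.5))
```

For a neutral gray input the chroma components stay close to zero, which is a quick sanity check that the conversion only moves brightness into Y.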
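S140. Perform second back-end processing on the first long-exposure image to obtain a second back-end image.
Here, performing second back-end processing on the first long-exposure image reuses the first long-exposure image in the acquired multi-frame exposure images, so no additional image needs to be acquired and the amount of captured data is reduced. The second back-end processing may be performed on one or more frames of first long-exposure images in the multi-frame exposure images to obtain the second back-end image.
It should be understood that the first long-exposure image is in the RAW domain and the second back-end image is in the YUV domain.
Optionally, as shown in FIG. 5, the second back-end processing includes RAW-domain-to-YUV-domain conversion.
Of course, the above is only one example of second back-end processing; the second back-end processing may include other steps, and the order of its steps can be changed as needed. The embodiments of the present application do not impose any limitation on this.
S150. Perform style transfer processing on the first back-end image and the second back-end image using the target style transfer network model to obtain the target image.
It should be understood that style transfer processing refers to using the color deviation between the first back-end image and the second back-end image to correct the colors of the first back-end image, thereby improving the quality of the target image.
It should be understood that the first back-end image is obtained from the multi-frame exposure images through a series of processing steps such as noise reduction, so its noise is already very low and its definition is very high, but its dark regions still have a color cast and differ considerably from the real scene. The second back-end image is obtained from the first long-exposure image through second back-end processing; because of its longer exposure time, the colors of the first long-exposure image are closer to the real scene, and so are the colors of the corresponding second back-end image. Therefore, to solve the dark-region color cast on the basis of the first back-end image, the colors of the first long-exposure image can be transferred to the first back-end image while the low noise and high definition of the first back-end image are retained. Performing style transfer processing on the first back-end image and the second back-end image with the target style transfer network model thus yields a target image of higher quality.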
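Optionally, before the target style transfer network model is used for style transfer processing, the style transfer network model needs to be trained; the above method may therefore further include the following S160.
S160. Train an initial style transfer network model using multiple pairs of training images to determine the target style transfer network model.
The initial style transfer network model may be any one of a Resnet model, a vgg model, a unet model, and a vnet model. Correspondingly, the determined target style transfer network model has the same underlying model as the initial style transfer network model.
Optionally, as a first implementation, as shown in FIG. 6, the initial style transfer network model can be trained with multiple pairs of training images in the RGB domain to determine the target style transfer network model.
For example, each pair of training images includes a first training image and a second training image, both in the RGB domain. The first training image and the second training image are images of the same scene to be shot, that is, they have the same content.
However, the colors of the dark regions in the first training image and the second training image are different. A dark region is a region of the first training image or the second training image whose luminance value is less than a preset luminance value. For example, the dark region of the first training image has a color cast, while the dark region of the second training image has normal color.
During training, the first training image of each pair is first input into a first feature extraction network model to determine a first feature layer (feature map) corresponding to the first training image, and the second training image is input into a second feature extraction network model to determine a second feature layer corresponding to the second training image.
The first feature layer and the second feature layer are then input into the initial style transfer network model to determine the first style transformation matrix or the chroma deviation coefficient between the first training image and the second training image.
After the first style transformation matrix or the chroma deviation coefficient is obtained, it can be applied to the first training image, and it can then be determined whether the image obtained after style transfer of the first training image based on the first style transformation matrix or the chroma deviation coefficient has the same colors as, or colors close to, the second training image; alternatively, it can be determined whether the difference between the pixel values of the resulting image and those of the second training image is less than a preset difference threshold (for example, 0.008). If the colors of the two images are judged to be close, or the difference between the two images is less than the preset difference threshold, the initial style transfer network model can be considered trained, and the trained initial style transfer network model is used as the target style transfer network model.
If the above conditions are not met, a backpropagation algorithm can be used to adjust the relevant parameters of the initial style transfer network model, and training of the initial style transfer network model continues with other training images until a target style transfer network model that meets the requirements is obtained.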
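The stopping test in this training procedure can be sketched as a simple comparison against the preset difference threshold. The text does not say how the difference between pixel values is aggregated, so the mean absolute difference used below is an assumption, as are the helper names; the backpropagation update itself depends on the chosen network and framework and is not shown.

```python
PRESET_DIFF_THRESHOLD = 0.008  # example value given in the text


def mean_abs_diff(image_a, image_b) -> float:
    """Mean absolute per-component difference (assumed aggregation).

    Images are lists of rows of equal-length pixel tuples with values in [0, 1].
    """
    total, count = 0.0, 0
    for row_a, row_b in zip(image_a, image_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            for a, b in zip(pixel_a, pixel_b):
                total += abs(a - b)
                count += 1
    return total / count


def training_converged(migrated, second_training_image,
                       threshold: float = PRESET_DIFF_THRESHOLD) -> bool:
    """True when the style-transferred first image is close enough to the second."""
    return mean_abs_diff(migrated, second_training_image) < threshold


close = training_converged([[(0.5, 0.5, 0.5)]], [[(0.504, 0.5, 0.5)]])
far = training_converged([[(0.5, 0.5, 0.5)]], [[(0.6, 0.5, 0.5)]])
```

When the check fails, training would continue with further pairs of training images, as described above.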
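It should be understood that a feature layer represents abstract features extracted from an image; such an abstract feature may be, for example, how dark or light a color is.
It should be understood that the first style transformation matrix represents the chroma deviation amount between the two input frames of images. The first style transformation matrix includes multiple chroma deviation amounts arranged in multiple rows and columns, and each chroma deviation amount is the difference between the chroma values at the same position in the two input frames. The chroma deviation coefficient represents the magnitude of the chroma deviation between the two input frames: the larger the chroma deviation coefficient, the larger the chroma deviation, and the smaller the chroma deviation coefficient, the smaller the chroma deviation.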
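To show what a matrix of per-position chroma differences looks like, the sketch below computes it directly from two YUV images and applies it to the first one. In the application this quantity is produced by the target style transfer network model, not by direct subtraction, so this is only an illustration of the definition. The sign convention (second minus first) and the helper names are assumptions.

```python
def chroma_deviation_matrix(first_yuv, second_yuv):
    """Per-position (dU, dV) between two same-size YUV images (second minus first)."""
    return [
        [(p2[1] - p1[1], p2[2] - p1[2]) for p1, p2 in zip(row1, row2)]
        for row1, row2 in zip(first_yuv, second_yuv)
    ]


def apply_chroma_deviation(first_yuv, matrix):
    """Shift the chroma of the first image; its luma (Y) is left unchanged."""
    return [
        [(p[0], p[1] + d[0], p[2] + d[1]) for p, d in zip(row, deltas)]
        for row, deltas in zip(first_yuv, matrix)
    ]


first = [[(0.5, 0.25, 0.0)]]     # sharp, low-noise image with a color cast
second = [[(0.75, 0.5, 0.25)]]   # long-exposure image with more faithful color
matrix = chroma_deviation_matrix(first, second)
corrected = apply_chroma_deviation(first, matrix)
```

The corrected pixel keeps the luma of the first image and takes the chroma of the second, which mirrors the stated goal of keeping low noise and high definition while correcting color.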
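Optionally, the first feature extraction network model and the second feature extraction network model may each be any one of a resnet model, a vgg model, and a mobilenet model.
The first feature extraction network model and the second feature extraction network model may be the same or different; the embodiments of the present application do not impose any limitation on this.
It should be understood that because training images in the RGB domain are input during training, the trained target style transfer network model can be used to determine the first style transformation matrix or the chroma deviation coefficient between images in the RGB domain.
It should also be understood that, to strengthen the processing capability of the trained target style transfer network model, the first training image and the second training image may differ in other ways besides the color of the dark regions; for example, the colors of the bright regions may also differ. The training images can be collected as needed, and the embodiments of the present application do not impose any limitation on this.
Optionally, as a second implementation, the initial style transfer network model can be trained with multiple pairs of training images in the YUV domain to determine the target style transfer network model.
The process of training the initial style transfer network model with multiple pairs of training images in the YUV domain is similar to the process of the first implementation above; refer to the description above, which is not repeated here.
It should be understood that because training images in the YUV domain are input during training, the trained target style transfer network model can be used to determine the first style transformation matrix or the chroma deviation coefficient between images in the YUV domain.
Optionally, as a third implementation, as shown in FIG. 7, the initial style transfer network model can be trained with multiple pairs of training images in the RGB domain to determine the target style transfer network model.
During training, the first training image and the second training image of each pair are first spliced to obtain a spliced training image; a feature network model then processes the spliced training image to obtain a corresponding spliced feature layer; and the initial style transfer network model processes the spliced feature layer to obtain the corresponding first style transformation matrix or chroma deviation coefficient.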
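One plausible reading of the splicing step is a channel-wise concatenation of the two training images, so that a single feature network sees both frames at each position. The text does not specify the splicing axis, so this choice and the name `splice_pair` are assumptions made for illustration.

```python
def splice_pair(first_image, second_image):
    """Concatenate two same-size images pixel by pixel along the channel axis.

    Each image is a list of rows of pixel tuples; two 3-channel images give
    one 6-channel spliced image.
    """
    if len(first_image) != len(second_image) or any(
        len(row1) != len(row2) for row1, row2 in zip(first_image, second_image)
    ):
        raise ValueError("training images of a pair must have the same size")
    return [
        [tuple(p1) + tuple(p2) for p1, p2 in zip(row1, row2)]
        for row1, row2 in zip(first_image, second_image)
    ]


spliced = splice_pair([[(1, 2, 3)]], [[(4, 5, 6)]])
```

Compared with the two-branch extraction of the first implementation, this uses a single feature network for both frames.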
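Optionally, as a fourth implementation, the initial style transfer network model can be trained with multiple pairs of training images in the YUV domain to determine the target style transfer network model.
The training process is similar to that of the third implementation; refer to the description above, which is not repeated here.
In this embodiment of the present application, the deep learning network model processes the multi-frame exposure images to obtain the first fusion image, and first back-end processing is performed on the first fusion image to obtain the first back-end image; one long-exposure frame, which has a longer exposure time and colors closer to the real scene, is then reused, and second back-end processing is performed on it to obtain the second back-end image. On this basis, the target style transfer network model performs a second fusion of the first back-end image and the second back-end image corresponding to the long-exposure image and, at the same time, transfers the style of that second back-end image onto the first back-end image, correcting the color of the dark regions of the first back-end image and yielding a target image whose colors are close to those of the real scene.
FIG. 8 shows a schematic flowchart of another image processing method provided by an embodiment of the present application. The method is applied to an electronic device.
As shown in FIG. 8, the image processing method provided by the embodiment of the present application may include S210 to S260. These steps are described in detail below.
S210. Acquire multi-frame exposure images.
The exposure times of the multi-frame exposure images are different, and the multi-frame exposure images include at least one frame of a first long-exposure image. The exposure time of the first long-exposure image is longer than the exposure times of the other exposure images in the multi-frame exposure images.
S220. Process the multi-frame exposure images using the deep learning network model to obtain a first fusion image corresponding to the multi-frame exposure images.
S230. Perform first back-end processing on the first fusion image to obtain a first back-end image corresponding to the first fusion image.
S210 to S230 are the same as S110 to S130 above; refer to the description above, which is not repeated here.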
S240、获取第二长曝光图像。
其中,第二长曝光图像的曝光时间相对于多帧曝光图像中除第一长曝光图像之外的其他曝光图像的曝光时间较长。
应理解,第二长曝光图像的曝光时间可以大于或者等于第一长曝光图像的曝光时间。
应理解,此处可以获取1帧或多帧第二长曝光图像。当获取多帧第二长曝光图像时,多帧第二长曝光图像的曝光时间可以相同,也可以不相同。
在一些实施例中,电子设备可以包括一个或多个图像传感器,那么,电子设备可以控制该一个或多个图像传感器进行拍摄,从而得到多帧曝光图像和第二长曝光图像。在另一些实施例中,无论电子设备是否包括图像传感器,电子设备都可以从本地存储或者从其他设备获取多帧曝光图像和第二长曝光图像。又或者,在另一实施例中,电子设备可以控制该一个或多个图像传感器进行拍摄,从而得到多帧曝光图像,而通过 本地存储或者从其他设备获取第二长曝光图像。当然,在实际应用过程中,电子设备还可以通过其他方式来获取第二长曝光图像,本申请实施例对此不进行任何限制。
应理解,第二长曝光图像可以是直接由图像传感器生成的曝光图像,也可以是由对该曝光图像进行一种或多种处理操作之后得到的图像。
应理解,第二长曝光图像为拜尔格式图像,也即,均为RAW域的图像。
应理解,第二长曝光图像和多帧曝光图像是对同一待拍摄场景连续拍摄的图像,其中,拍摄第二长曝光图像和多帧曝光图像的次序可以根据需要进行,本申请实施例对此不进行任何限制。
S250、对第二长曝光图像进行第二后端处理,得到第二后端图像。
针对第二后端处理的描述,与上述S140中的描述相同,在此不再赘述。
S260、利用目标风格迁移网络模型,对第一后端图像和第二后端图像进行风格迁移处理,得到目标图像。
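Optionally, as shown in FIG. 8, before S260, the method may further include S251 to S253.
S251: Determine whether the ambient brightness value (lux) corresponding to the first backend image is less than a preset ambient brightness value.
S252: If not, perform no further processing and output the first backend image as the target image.
S253: If so, perform style transfer processing on the first backend image and the second backend image by using the target style transfer network model to obtain the target image.
A light sensor may be used to sense and store the ambient brightness value of the surroundings at the time the first backend image is obtained. For example, if the light sensor determines that this ambient brightness value is 120, which is greater than the preset ambient brightness value of 100, the ambient lighting can be considered good: the color of the resulting first backend image is reliable and shows no large color deviation, so style transfer processing is not needed and the first backend image is output directly as the target image.
If the light sensor determines that the ambient brightness value at the time the first backend image is obtained is 60, far below the preset ambient brightness value of 100, the ambient lighting can be considered very poor: the color of the resulting first backend image is not reliable, and a large color deviation appears in the dark region. To resolve the color deviation in the dark region, the target style transfer network model is used to perform style transfer processing on the first backend image and the second backend image, yielding a target image whose colors are close to the real scene and whose dark region has no color deviation.
Here, checking the ambient brightness value corresponding to the first backend image screens out images whose color deviation is not severe. These images skip style transfer processing, which reduces computation and improves processing efficiency.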
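The branching in S251 to S253 amounts to a simple gate, sketched below. The preset value of 100 comes from the example in the text; the function name, the parameter names, and passing the style transfer step as a callable are assumptions made for illustration.

```python
def choose_target_image(first_backend, second_backend, ambient_lux,
                        style_transfer, preset_lux=100):
    """Gate style transfer on the ambient brightness value.

    style_transfer is a caller-supplied function standing in for the target
    style transfer network model; it is only invoked in low light.
    """
    if ambient_lux < preset_lux:
        # S253: lighting is poor, so correct dark-region color.
        return style_transfer(first_backend, second_backend)
    # S252: lighting is adequate, so output the first backend image as-is.
    return first_backend
```

With the values from the text, a reading of 120 returns the first backend image unchanged, and a reading of 60 triggers the style transfer call.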
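In this embodiment of this application, a deep learning network model processes the multiple frames of exposure images to obtain a first fused image, and first backend processing is performed on the first fused image to obtain a first backend image. Then one additional long-exposure frame, which has a longer exposure time and is closer to the colors of the real scene, is captured, and second backend processing is performed on it to obtain a second backend image. On this basis, the target style transfer network model performs a second fusion of the first backend image and the second backend image corresponding to the long-exposure image. During this second fusion, the style of the second backend image is transferred to the first backend image, which corrects the color of the dark region of the first backend image and yields a target image close to the colors of the real scene.
On this basis, the first backend image may also be screened by its ambient brightness value. When the first backend image meets the requirement, its color deviation is not severe, and it may be output directly without further processing. The series of processing described above is performed only on a first backend image that does not meet the requirement, to correct the color of its dark region.
With reference to FIG. 2 and FIG. 8, when style transfer processing is performed on the first backend image and the second backend image by using the target style transfer network model, either of the following two implementations may be used.
Optionally, in one implementation, FIG. 9 is a schematic flowchart of performing style transfer processing on the first backend image and the second backend image. As shown in FIG. 9, the process may include S310 to S350.
S310: Perform YUV-domain-to-RGB-domain conversion on the first backend image and the second backend image separately, to obtain a first intermediate image corresponding to the first backend image and a second intermediate image corresponding to the second backend image.
The first backend image and the second backend image are both in the YUV domain; after the conversion, the first intermediate image and the second intermediate image are both in the RGB domain.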
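A per-pixel version of the YUV-to-RGB step might look like the sketch below. The source does not specify which conversion matrix is used; this example assumes 8-bit full-range BT.601 coefficients (the JPEG/JFIF convention), and the function names are hypothetical.

```python
def _clamp8(value):
    """Round to the nearest integer and clamp to the 8-bit range [0, 255]."""
    return max(0, min(255, int(round(value))))


def yuv_to_rgb(y, u, v):
    """Convert one full-range 8-bit YUV pixel to RGB (assumed BT.601)."""
    r = y + 1.402 * (v - 128)
    g = y - 0.344136 * (u - 128) - 0.714136 * (v - 128)
    b = y + 1.772 * (u - 128)
    return _clamp8(r), _clamp8(g), _clamp8(b)
```

Neutral chroma (U = V = 128) maps to a gray whose R, G, and B all equal Y, which is a quick sanity check for any chosen matrix.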
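S320: Process the first intermediate image and the second intermediate image by using the target style transfer network model to obtain a first style transformation matrix.
It should be understood that this target style transfer network model is generated by training with training images in the RGB domain; only then can it process the first intermediate image and the second intermediate image, which are in the RGB domain.
S330: Upsample the first style transformation matrix to obtain a second style transformation matrix.
Upsampling means enlarging an image; here, the smaller first style transformation matrix is enlarged into the larger second style transformation matrix. After enlargement, the second style transformation matrix has the same size as the first intermediate image and the second intermediate image. Because the first backend image and the second backend image have the same size as the first intermediate image and the second intermediate image, the second style transformation matrix also has the same size as the first backend image and the second backend image.
The second style transformation matrix contains the chroma deviation amounts corresponding to pixels arranged in rows and columns.
It should be understood that, in S320, the target style transfer network model performs downsampling in order to extract the first style transformation matrix, so the first style transformation matrix is smaller than the first intermediate image and the second intermediate image. For example, if the first intermediate image and the second intermediate image have a size of 512×512×3, the resulting first style transformation matrix has a size of 16×16×6. Therefore, in S330, the first style transformation matrix needs to be upsampled to enlarge it into a second style transformation matrix of the same size as the first backend image, so that the second style transformation matrix can subsequently be used to continue processing the first backend image.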
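The enlargement in S330 can be illustrated with nearest-neighbor upsampling, sketched below. The source does not say which interpolation is used, so nearest-neighbor is an assumption chosen for brevity (bilinear interpolation would be an equally plausible choice). The function name is hypothetical.

```python
def upsample_nearest(matrix, out_height, out_width):
    """Enlarge a 2-D matrix to (out_height, out_width) by nearest-neighbor
    sampling. Entries may be scalars or per-position channel tuples."""
    in_height, in_width = len(matrix), len(matrix[0])
    return [
        [
            matrix[i * in_height // out_height][j * in_width // out_width]
            for j in range(out_width)
        ]
        for i in range(out_height)
    ]
```

For the sizes quoted in the text, a 16×16 matrix passed with `out_height=512` and `out_width=512` yields a 512×512 result in which each source entry covers a 32×32 block.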
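S340: Determine a mask image corresponding to the second intermediate image.
When the second intermediate image is obtained by processing the first long-exposure image, determining the mask image corresponding to the second intermediate image is equivalent to determining the mask image corresponding to the first long-exposure image. When the second intermediate image is obtained by processing the second long-exposure image, determining the mask image corresponding to the second intermediate image is equivalent to determining the mask image corresponding to the second long-exposure image.
The mask image masks the dark region of the first long-exposure image or the second long-exposure image, so that the bright region and the dark region of the image can be processed separately during the subsequent style transfer. The mask image is a binary image. For example, the mask image may be generated according to the brightness of each pixel.
For example, the pixels of the second intermediate image are divided into a bright region and a dark region by brightness. When the mask image is generated, a pixel in the bright region takes the value 0, that is, it appears white in the mask image, and a pixel in the dark region takes the value 1, that is, it appears black in the mask image.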
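A brightness-threshold mask consistent with this description could be sketched as follows. The text gives no threshold value or thresholding rule, so the fixed cutoff of 64 on an 8-bit luma scale and the function name are purely illustrative assumptions; only the 0-for-bright, 1-for-dark convention comes from the source.

```python
def dark_region_mask(luma, threshold=64):
    """Build a binary mask from per-pixel brightness.

    luma[i][j] is the brightness at row i, column j (assumed 8-bit).
    Dark pixels (below the threshold) map to 1, bright pixels to 0.
    """
    return [[1 if value < threshold else 0 for value in row] for row in luma]
```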
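S350: Fuse the first backend image, the second style transformation matrix, and the mask image to obtain the target image.
Optionally, the target image may be determined by using the following Formula (1):
Muv(i,j)=Suv(i,j)×[1-N(i,j)]+Luv(i,j)×N(i,j)  Formula (1)
Here, Suv(i,j) represents the chroma value of the first backend image at the pixel position in row i, column j; Luv(i,j) represents the chroma deviation value at row i, column j of the second style transformation matrix; N(i,j) represents the value, 0 or 1, of the mask image at the pixel position in row i, column j; and Muv(i,j) represents the target chroma value of the target image at the pixel position in row i, column j.
Weights may further be added to Formula (1), so that the target image is determined by using Formula (2):
Muv(i,j)=a×Suv(i,j)×[1-N(i,j)]+b×Luv(i,j)×N(i,j)  Formula (2)
Here, a represents a first weight assigned to the first backend image, and b represents a second weight assigned to the second style transformation matrix. Adjusting the weights used in the fusion changes the relative contribution of the original chroma value and the chroma deviation value, which gives more precise control over the chroma values of the generated target image.
It should be understood that the above formulas process only the chroma values of the image and produce the target chroma values of the target image. The brightness value of the target image may be determined from the brightness value at the same position in the first backend image; for example, it may be equal to that brightness value. In this way, the brightness value and the target chroma value of every pixel of the target image are obtained, and the resulting target image is in the YUV domain.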
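Formulas (1) and (2) translate directly into a per-position computation. The sketch below handles one chroma channel at a time (it would be applied to U and V separately); the function and parameter names are hypothetical, and setting `a = b = 1` reduces Formula (2) to Formula (1).

```python
def fuse_chroma(s_uv, l_uv, mask, a=1.0, b=1.0):
    """Apply Muv = a*Suv*(1 - N) + b*Luv*N at every position.

    s_uv: chroma values of the first backend image (one channel).
    l_uv: entries of the second style transformation matrix (same channel).
    mask: binary mask N, with 1 in the dark region and 0 in the bright region.
    """
    return [
        [a * s * (1 - n) + b * l * n for s, l, n in zip(row_s, row_l, row_n)]
        for row_s, row_l, row_n in zip(s_uv, l_uv, mask)
    ]
```

Where the mask is 0 the output keeps the (weighted) chroma of the first backend image, and where the mask is 1 it takes the (weighted) entry of the second style transformation matrix, exactly as the formulas state.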
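In another example, if the target style transfer network model is generated by training with training images in the YUV domain, no domain conversion is needed. The target style transfer network model can directly process the first backend image and the second backend image, which are in the YUV domain, to obtain the first style transformation matrix. The first style transformation matrix is then upsampled to obtain the second style transformation matrix, and the mask image of the second backend image is determined. The first backend image, the second style transformation matrix, and the mask image are fused to obtain the target image.
For the specific process, refer to the description of S330 to S350 above; details are not repeated here.
Optionally, in another implementation, FIG. 10 is a schematic flowchart of another way of performing style transfer processing on the first backend image and the second backend image. As shown in FIG. 10, the process may include S410 to S460.
S410: Perform YUV-domain-to-RGB-domain conversion on the first backend image and the second backend image separately, to obtain a first intermediate image corresponding to the first backend image and a second intermediate image corresponding to the second backend image.
The first backend image and the second backend image are both in the YUV domain; after the domain conversion, the first intermediate image and the second intermediate image are both in the RGB domain.
S420: Process the first intermediate image and the second intermediate image by using the target style transfer network model to obtain a chroma deviation coefficient.
It should be understood that this target style transfer network model is generated by training with training images in the RGB domain; only then can it process the first intermediate image and the second intermediate image, which are in the RGB domain.
It should be understood that the chroma deviation coefficient represents the correspondence between brightness and the chroma deviation value. To allow finer adjustment of the image chroma later, different chroma deviation coefficients may be obtained for different regions of the image, or for different pixel positions.
S430: Determine the first style transformation matrix based on the first intermediate image and the chroma deviation coefficient.
The first style transformation matrix may be determined by using the following Formula (3):
L'uv(i,j)=f[Y(i,j)*k(i,j)]  Formula (3)
Here, Y(i,j) represents the brightness value of the first intermediate image at the pixel position in row i, column j; k(i,j) represents the chroma deviation coefficient at row i, column j; L'uv(i,j) represents the chroma deviation value at row i, column j of the first style transformation matrix; and f indicates that Y(i,j)*k(i,j) and L'uv(i,j) are related by a function mapping.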
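Formula (3) can be sketched as an element-wise product followed by a mapping. The source leaves the mapping f unspecified, so the identity default used here is only a placeholder assumption; the function and parameter names are likewise hypothetical.

```python
def style_matrix_from_coefficients(luma, coefficients, f=lambda x: x):
    """Compute L'uv(i,j) = f(Y(i,j) * k(i,j)) at every position.

    luma: brightness values Y of the first intermediate image.
    coefficients: chroma deviation coefficients k, same shape as luma.
    f: mapping function; identity by default because the source does not
       define it.
    """
    return [
        [f(y * k) for y, k in zip(row_y, row_k)]
        for row_y, row_k in zip(luma, coefficients)
    ]
```

A caller could pass a clamping function as `f`, for example `lambda x: max(-128.0, min(127.0, x))`, if the deviation values need to stay within a signed 8-bit chroma range.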
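S440: Upsample the first style transformation matrix to obtain a second style transformation matrix.
S450: Determine a mask image corresponding to the second intermediate image.
S460: Fuse the first backend image, the second style transformation matrix, and the mask image to obtain the target image.
The description of S440 to S460 is the same as that of S330 to S350 above; details are not repeated here.
In another example, if the target style transfer network model is generated by training with training images in the YUV domain, no domain conversion is needed. The target style transfer network model can directly process the first backend image and the second backend image, which are in the YUV domain, to obtain the chroma deviation coefficient, and the first style transformation matrix is determined based on the first backend image and the chroma deviation coefficient. The first style transformation matrix is then upsampled to obtain the second style transformation matrix, and the mask image of the second backend image is determined. The first backend image, the second style transformation matrix, and the mask image are fused to obtain the target image.
For the specific process, refer to the description of S430 to S460 above; details are not repeated here.
FIG. 11 is a schematic diagram of an effect according to an embodiment of this application.
In a low-illuminance night scene, if an existing image processing method is used to capture and fuse multiple frames of exposure images, the first fused image shown in (a) of FIG. 11 may be obtained. Because the illuminance is low, the dark region of the first fused image is large, and the sky, the ground, and other parts belonging to the dark region show a color cast, which results in a very poor user experience.
By contrast, when a long-exposure frame as shown in (b) of FIG. 11 is reused or newly captured, and the image processing method provided in the embodiments of this application processes the first fused image and the long-exposure image, the target image shown in (c) of FIG. 11 can be obtained. Because the long-exposure image has a longer exposure time than the multiple frames of exposure images from which the first fused image is obtained, it restores the colors of the real scene well. Style transfer is therefore effective: the style of the long-exposure image is transferred to the target image, and the color reproduction of the target image improves.
The image processing method provided in the embodiments of this application, together with the related display interfaces and effect diagrams, is described in detail above with reference to FIG. 1 to FIG. 11. The electronic device, apparatus, and chip provided in the embodiments of this application are described in detail below with reference to FIG. 12 to FIG. 15. It should be understood that the electronic device, apparatus, and chip in the embodiments of this application can perform the various image processing methods of the foregoing embodiments; that is, for the specific working processes of the following products, refer to the corresponding processes in the foregoing method embodiments.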
图12示出了一种适用于本申请的电子设备的结构示意图。电子设备100可以用于实现上述方法实施例中描述的方法。
电子设备100可以是手机、智慧屏、平板电脑、可穿戴电子设备、车载电子设备、增强现实(augmented reality,AR)设备、虚拟现实(virtual reality,VR)设备、笔记本电脑、超级移动个人计算机(ultra-mobile personal computer,UMPC)、上网本、个人数字助理(personal digital assistant,PDA)、投影仪等等,本申请实施例对电子设备100的具体类型不作任何限制。
电子设备100可以包括处理器110,外部存储器接口120,内部存储器121,通用串行总线(universal serial bus,USB)接口130,充电管理模块140,电源管理模块141,电池142,天线1,天线2,移动通信模块150,无线通信模块160,音频模块170,扬声器170A,受话器170B,麦克风170C,耳机接口170D,传感器模块180,按键190,马达191,指示器192,摄像头193,显示屏194,以及用户标识模块(subscriber identification module,SIM)卡接口195等。其中传感器模块180可以包括压力传感器180A,陀螺仪传感器180B,气压传感器180C,磁传感器180D,加速度传感器180E,距离传感器180F,接近光传感器180G,指纹传感器180H,温度传感器180J,触摸传感器180K,环境光传感器180L,骨传导传感器180M等。
处理器110可以包括一个或多个处理单元。例如,处理器110可以包括以下处理单元中的至少一个:应用处理器(application processor,AP)、调制解调处理器、图形处理器(graphics processing unit,GPU)、图像信号处理器(image signal processor,ISP)、控制器、视频编解码器、数字信号处理器(digital signal processor,DSP)、基带处理器、神经网络处理器(neural-network processing unit,NPU)。其中,不同的处理单元可以是独立的器件,也可以是集成的器件。
控制器可以根据指令操作码和时序信号,产生操作控制信号,完成取指令和执行指令的控制。
处理器110中还可以设置存储器,用于存储指令和数据。在一些实施例中,处理器110中的存储器为高速缓冲存储器。该存储器可以保存处理器110刚用过或循环使用的指令或数据。如果处理器110需要再次使用该指令或数据,可从所述存储器中直接调用。避免了重复存取,减少了处理器110的等待时间,因而提高了系统的效率。
图12所示的各模块间的连接关系只是示意性说明,并不构成对电子设备100的各模块间的连接关系的限定。可选地,电子设备100的各模块也可以采用上述实施例中多种连接方式的组合。
电子设备100的无线通信功能可以通过天线1,天线2,移动通信模块150,无线通信模块160,调制解调处理器以及基带处理器等实现。
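The electronic device 100 implements the display function through the GPU, the display 194, the application processor, and the like. The GPU is a microprocessor for image processing and connects the display 194 and the application processor. The GPU performs mathematical and geometric calculations for graphics rendering. The processor 110 may include one or more GPUs that execute program instructions to generate or change display information.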
显示屏194用于显示图像,视频等。
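The camera 193 is configured to capture images or video. It may be turned on by an application instruction to implement the photographing function, for example, to capture an image of any scene. The camera may include components such as an imaging lens, a filter, and an image sensor. Light emitted or reflected by an object enters the imaging lens, passes through the filter, and finally converges on the image sensor. The imaging lens mainly converges the light emitted or reflected by all objects within the photographing field of view (also called the to-be-photographed scene or target scene, which may be understood as the scene image the user expects to capture) to form an image. The filter mainly filters out unwanted light waves (for example, light waves other than visible light, such as infrared). The image sensor mainly performs photoelectric conversion on the received optical signal to turn it into an electrical signal, which is input to the processor 110 for subsequent processing. The camera 193 may be located on the front or the back of the electronic device 100. The number and arrangement of cameras may be set as required, which is not limited in this application.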
内部存储器121可以用于存储计算机可执行程序代码,所述可执行程序代码包括指令。内部存储器121可以包括存储程序区和存储数据区。其中,存储程序区可存储操作系统,至少一个功能所需的应用程序(比如声音播放功能,图像播放功能等)等。存储数据区可存储电子设备100使用过程中所创建的数据(比如音频数据,电话本等)等。此外,内部存储器121可以包括高速随机存取存储器,还可以包括非易失性存储器,例如至少一个磁盘存储器件,闪存器件,通用闪存存储器(universal flash storage,UFS)等。处理器110通过运行存储在内部存储器121的指令,和/或存储在设置于处理器中的存储器的指令,执行电子设备100的各种功能应用以及数据处理。
内部存储器121还可以存储本申请实施例提供的图像处理方法的软件代码,当处理器110运行所述软件代码时,执行图像处理方法的流程步骤,得到清晰度较高的图像。
内部存储器121还可以存储拍摄得到的图像。
外部存储器接口120可以用于连接外部存储卡,例如Micro SD卡,实现扩展电子设备100的存储能力。外部存储卡通过外部存储器接口120与处理器110通信,实现数据存储功能。例如将音乐等文件保存在外部存储卡中。
当然,本申请实施例提供的图像处理方法的软件代码也可以存储在外部存储器中,处理器110可以通过外部存储器接口120运行所述软件代码,执行图像处理方法的流程步骤,得到清晰度较高的图像。电子设备100拍摄得到的图像也可以存储在外部存储器中。
应理解,用户可以指定将图像存储在内部存储器121还是外部存储器中。比如,电子设备100当前与外部存储器相连接时,若电子设备100拍摄得到1帧图像时,可以弹出提示信息,以提示用户将图像存储在外部存储器还是内部存储器;当然,还可以有其他指定方式,本申请实施例对此不进行任何限制;或者,电子设备100检测到内部存储器121的内存量小于预设量时,可以自动将图像存储在外部存储器中。
电子设备100可以通过音频模块170,扬声器170A,受话器170B,麦克风170C,耳机接口170D,以及应用处理器等实现音频功能。例如音乐播放,录音等。
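In this embodiment of this application, the camera 193 may capture multiple frames of exposure images, and the processor 110 performs image processing on them. The image processing may include noise reduction, dynamic range adjustment, tone mapping, domain conversion, upsampling, fusion, and the like, and it yields a target image with good color. The processor 110 may then control the display 194 to present the processed target image, which is the image captured in the low-illuminance scene.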
上文详细描述了电子设备100的硬件系统,下面介绍电子设备100的软件系统。软件系统可以采用分层架构、事件驱动架构、微核架构、微服务架构或云架构,本申请实施例以分层架构为例,示例性地描述电子设备100的软件系统。
如图13所示,采用分层架构的软件系统分成若干个层,每一层都有清晰的角色和分工。层与层之间通过软件接口通信。在一些实施例中,软件系统可以分为五层,从上至下分别为应用层210、应用框架层220、硬件抽象层230、驱动层240以及硬件层250。
应用层210可以包括相机、图库,还可以包括日历、通话、地图、导航、WLAN、蓝牙、音乐、视频、短信息等应用程序。
应用框架层220为应用层210的应用程序提供应用程序访问接口和编程框架。
例如,应用框架层包括相机访问接口,该相机访问接口用于通过相机管理和相机设备来提供相机的拍摄服务。
应用框架层220中的相机管理用于管理相机。相机管理可以获取相机的参数,例如判断相机的工作状态等。
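The camera device in the application framework layer 220 provides a data access interface between different camera devices and camera management.
The hardware abstraction layer 230 abstracts the hardware. For example, the hardware abstraction layer may include a camera hardware abstraction layer and abstraction layers for other hardware devices. The camera hardware abstraction layer may include camera device 1, camera device 2, and so on. The camera hardware abstraction layer may be connected to a camera algorithm library and may invoke algorithms in it.
The driver layer 240 provides drivers for different hardware devices. For example, the driver layer may include a camera driver, a digital signal processor driver, and a graphics processor driver.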
硬件层250可以包括传感器、图像信号处理器、数字信号处理器、图形处理器以及其他硬件设备。其中,传感器可以包括传感器1、传感器2等,还可以包括深度传感器(time of flight,TOF)和多光谱传感器等,对此,本申请实施例没有任何限制。
下面结合显示拍照场景,示例性说明电子设备100的软件系统的工作流程。
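When the user performs a tap operation on the touch sensor 180K, the camera app is woken by the tap and invokes the camera devices of the camera hardware abstraction layer through the camera access interface. For example, the camera hardware abstraction layer may deliver, to the camera device driver, an instruction to invoke a particular camera; at the same time, the camera algorithm library starts to load the deep learning network model and the target style transfer network model used in the embodiments of this application.
After a sensor at the hardware layer is invoked, for example, after sensor 1 in a particular camera is invoked to obtain multiple frames of exposure images with different exposure times, the multiple frames are returned to the hardware abstraction layer, where the deep learning network model in the loaded camera algorithm library performs processing such as noise reduction and multi-exposure fusion to generate the first fused image. The image signal processor is then invoked to perform processing such as dynamic range adjustment, tone mapping, and RGB-domain-to-YUV-domain conversion on the first fused image. At the same time, the image signal processor is invoked to perform RAW-domain-to-YUV-domain conversion on the reused first long-exposure image or the additionally captured second long-exposure image. The target style transfer network model in the loaded camera algorithm library then performs style transfer processing to obtain the target image.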
将得到的目标图像经相机硬件抽象层、相机访问接口发送回相机应用进行显示和存储。
下面将结合图14详细描述本申请的装置实施例。应理解,本申请实施例中的装置可以执行前述本申请实施例的各种方法,即以下各种产品的具体工作过程,可以参考前述方法实施例中的对应过程。
图14是本申请实施例提供的一种图像处理装置300的结构示意图。该图像处理装置300包括获取模块310与处理模块320。
其中,获取模块310用于获取多帧曝光图像,多帧曝光图像的曝光时间不同,多帧曝光图像包括至少1帧第一长曝光图像,第一长曝光图像的曝光时间相对于多帧曝光图像中的其他曝光图像的曝光时间较长。
处理模块320用于对多帧曝光图像进行风格迁移处理,得到目标图像。
可选地,作为一种实施例,处理模块320还用于:
利用深度学习网络模型,对多帧曝光图像进行处理,得到第一融合图像;
对第一融合图像进行第一后端处理,得到第一后端图像。
可选地,作为一种实施例,处理模块320还用于:
对第一长曝光图像进行第二后端处理,得到第二后端图像。
可选地,作为一种实施例,处理模块320还用于:
获取第二长曝光图像,第二长曝光图像的曝光时间相对于多帧曝光图像中除第一长曝光图像之外的其他曝光图像的曝光时间较长;
对第二长曝光图像进行第二后端处理,得到第二后端图像。
可选地,作为一种实施例,处理模块320还用于:
利用目标风格迁移网络模型,对第一后端图像和第二后端图像进行风格迁移处理,得到目标图像。
可选地,作为一种实施例,处理模块320还用于:
判断第一后端图像对应的环境亮度值是否小于预设环境亮度值;
若否,则将第一后端图像作为目标图像输出;
若是,则利用目标风格迁移网络模型,对第一后端图像和所述第二后端图像进行风格迁移处理,得到目标图像。
可选地,作为一种实施例,处理模块320还用于:
利用目标风格迁移网络模型,对第一后端图像和第二后端图像进行处理,得到第一风格变换矩阵;
对第一风格变换矩阵进行上采样,得到第二风格变换矩阵;
确定第二后端图像对应的掩膜图像;
对第一后端图像、第二风格变换矩阵和掩膜图像进行融合处理,得到目标图像。
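Optionally, in an embodiment, the processing module 320 is further configured to:
process the first backend image and the second backend image by using the target style transfer network model to obtain a chroma deviation coefficient; and
determine the first style transformation matrix based on the first backend image and the chroma deviation coefficient.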
可选地,作为一种实施例,处理模块320还用于:
对第一后端图像和第二后端图像分别进行YUV域转RGB域处理;
利用目标风格迁移网络模型对第一后端图像和第二后端图像进行处理,包括:
利用目标风格迁移网络模型,对进行了YUV域转RGB域处理后的第一后端图像和第二后端图像进行处理。
可选地,作为一种实施例,处理模块320还用于:
利用多对训练图像,对初始风格迁移网络模型进行训练,确定目标风格迁移网络模型;其中,多对训练图像均位于RGB域,每对训练图像的内容相同但暗区对应的颜色不同。
可选地,作为一种实施例,处理模块320还用于:
利用多对训练图像,对初始风格迁移网络模型进行训练,确定目标风格迁移网络模型;其中,多对训练图像均位于YUV域,每对训练图像的内容相同但暗区对应的颜色不同。
可选地,作为一种实施例,处理模块320还用于:
针对每对训练图像中的每帧训练图像,利用1个特征网络模型进行处理,得到对应的特征层;
利用每对训练图像对应的2个特征层对初始风格迁移网络模型进行训练,得到目标风格迁移网络模型。
可选地,作为一种实施例,处理模块320还用于:
将每对训练图像包括的2帧训练图像进行拼接,得到拼接训练图像;
利用特征网络模型对拼接训练图像进行处理,得到对应的拼接特征层;
利用拼接特征层对初始风格迁移网络模型进行训练,得到目标风格迁移网络模型。
可选地,作为一种实施例,深度学习网络模型为Unet模型、LLnet模型和FCN模型中的任意一种。
可选地,作为一种实施例,第一后端处理包括:RGB域转YUV域。
可选地,作为一种实施例,第一后端处理还包括:动态范围调整、色调映射中的至少一项。
可选地,作为一种实施例,第二后端处理包括:RAW域转YUV域。
可选地,作为一种实施例,目标风格迁移网络模型为Resnet模型、vgg模型、unet模型、vnet模型中的任意一种。
需要说明的是,上述图像处理装置300以功能模块的形式体现。这里的术语“模块”可以通过软件和/或硬件形式实现,对此不作具体限定。
例如,“模块”可以是实现上述功能的软件程序、硬件电路或二者结合。所述硬件电路可能包括应用特有集成电路(application specific integrated circuit,ASIC)、电子电路、用于执行一个或多个软件或固件程序的处理器(例如共享处理器、专有处理器或组处理器等)和存储器、合并逻辑电路和/或其它支持所描述的功能的合适组件。
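Therefore, the units of the examples described in the embodiments of this application can be implemented by electronic hardware or by a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the particular application and design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each particular application, but such an implementation should not be considered beyond the scope of this application.
An embodiment of this application further provides a computer-readable storage medium that stores computer instructions. When the computer instructions are run on the image processing apparatus 300, the image processing apparatus 300 is enabled to perform the image processing method described above.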
所述计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,例如,所述计算机指令可以从一个网站站点、计算机、服务器或者数据中心通过有线(例如同轴电缆、光纤、数字用户线(digital subscriber line,DSL))或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可以是计算机能够存取的任何可用介质或者是包含一个或多个可以用介质集成的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质(例如,软盘、硬盘、磁带),光介质、或者半导体介质(例如固态硬盘(solid state disk,SSD))等。
本申请实施例还提供了一种包含计算机指令的计算机程序产品,当其在图像处理装置300上运行时,使得图像处理装置300可以执行前述所示的图像处理方法。
图15为本申请实施例提供的一种芯片的结构示意图。图15所示的芯片可以为通用处理器,也可以为专用处理器。该芯片包括处理器401。其中,处理器401用于支持图像处理装置300执行前述所示的技术方案。
可选的,该芯片还包括收发器402,收发器402用于接受处理器401的控制,用于支持图像处理装置300执行前述所示的技术方案。
可选的,图15所示的芯片还可以包括:存储介质403。
需要说明的是,图15所示的芯片可以使用下述电路或者器件来实现:一个或多个现场可编程门阵列(field programmable gate array,FPGA)、可编程逻辑器件(programmable logic device,PLD)、控制器、状态机、门逻辑、分立硬件部件、任何其他适合的电路、或者能够执行本申请通篇所描述的各种功能的电路的任意组合。
上述本申请实施例提供的电子设备、图像处理装置300、计算机存储介质、计算机程序产品、芯片均用于执行上文所提供的方法,因此,其所能达到的有益效果可参考上文所提供的方法对应的有益效果,在此不再赘述。
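It should be understood that the foregoing is intended only to help a person skilled in the art better understand the embodiments of this application, not to limit their scope. Based on the examples given above, a person skilled in the art can clearly make various equivalent modifications or changes. For example, some steps in the embodiments of the foregoing image processing method may be unnecessary, some steps may be newly added, or any two or more of the foregoing embodiments may be combined. Solutions resulting from such modifications, changes, or combinations also fall within the scope of the embodiments of this application.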
还应理解,上文对本申请实施例的描述着重于强调各个实施例之间的不同之处,未提到的相同或相似之处可以互相参考,为了简洁,这里不再赘述。
还应理解,上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。
还应理解,本申请实施例中,“预先设定”、“预先定义”可以通过在设备(例如,包括电子设备)中预先保存相应的代码、表格或其他可用于指示相关信息的方式来实现,本申请对于其具体的实现方式不做限定。
还应理解,本申请实施例中的方式、情况、类别以及实施例的划分仅是为了描述的方便,不应构成特别的限定,各种方式、类别、情况以及实施例中的特征在不矛盾的情况下可以相结合。
还应理解,在本申请的各个实施例中,如果没有特殊说明以及逻辑冲突,不同的实施例之间的术语和/或描述具有一致性、且可以相互引用,不同的实施例中的技术特征根据其内在的逻辑关系可以组合形成新的实施例。
最后应说明的是:以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何在本申请揭露的技术范围内的变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。

Claims (23)

  1. 一种图像处理方法,其特征在于,所述方法包括:
    显示第一界面,所述第一界面包括第一控件;
    检测到对所述第一控件的第一操作;
    响应于所述第一操作,获取多帧曝光图像,多帧所述曝光图像的曝光时间不同,多帧所述曝光图像包括至少1帧第一长曝光图像,所述第一长曝光图像的曝光时间相对于多帧所述曝光图像中的其他曝光图像的曝光时间较长;
    对多帧所述曝光图像进行风格迁移处理,得到目标图像。
  2. 根据权利要求1所述的方法,其特征在于,在对多帧所述曝光图像进行风格迁移处理,得到目标图像之前,所述方法还包括:
    利用深度学习网络模型,对多帧所述曝光图像进行处理,得到第一融合图像;
    对所述第一融合图像进行第一后端处理,得到第一后端图像。
  3. 根据权利要求2所述的方法,其特征在于,所述方法还包括:
    对所述第一长曝光图像进行第二后端处理,得到第二后端图像。
  4. 根据权利要求2所述的方法,其特征在于,所述方法还包括:
    获取第二长曝光图像,所述第二长曝光图像的曝光时间相对于多帧所述曝光图像中除所述第一长曝光图像之外的其他曝光图像的曝光时间较长;
    对所述第二长曝光图像进行第二后端处理,得到第二后端图像。
  5. 根据权利要求3或4所述的方法,其特征在于,对多帧所述曝光图像进行风格迁移处理,得到目标图像,包括:
    利用目标风格迁移网络模型,对所述第一后端图像和所述第二后端图像进行风格迁移处理,得到所述目标图像。
  6. 根据权利要求5所述的方法,其特征在于,所述方法还包括:
    判断所述第一后端图像对应的环境亮度值是否小于预设环境亮度值;
    若否,则将所述第一后端图像作为所述目标图像输出;
    若是,则利用所述目标风格迁移网络模型,对所述第一后端图像和所述第二后端图像进行风格迁移处理,得到所述目标图像。
  7. 根据权利要求5或6所述的方法,其特征在于,利用目标风格迁移网络模型,对所述第一后端图像和所述第二后端图像进行处理,得到所述目标图像,包括:
    利用所述目标风格迁移网络模型,对所述第一后端图像和所述第二后端图像进行处理,得到第一风格变换矩阵;
    对所述第一风格变换矩阵进行上采样,得到第二风格变换矩阵;
    确定所述第二后端图像对应的掩膜图像;
    对所述第一后端图像、所述第二风格变换矩阵和所述掩膜图像进行融合处理,得到所述目标图像。
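  8. The method according to claim 7, wherein processing the first backend image and the second backend image by using the target style transfer network model to obtain the first style transformation matrix comprises:
    processing the first backend image and the second backend image by using the target style transfer network model to obtain a chroma deviation coefficient; and
    determining the first style transformation matrix based on the first backend image and the chroma deviation coefficient.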
  9. 根据权利要求8所述的方法,其特征在于,所述方法还包括:
    对所述第一后端图像和所述第二后端图像分别进行YUV域转RGB域处理;
    利用所述目标风格迁移网络模型对所述第一后端图像和所述第二后端图像进行处理,包括:
    利用所述目标风格迁移网络模型,对进行了YUV域转RGB域处理后的所述第一后端图像和所述第二后端图像进行处理。
  10. 根据权利要求9所述的方法,其特征在于,所述方法还包括:
    利用多对训练图像,对初始风格迁移网络模型进行训练,确定所述目标风格迁移网络模型;其中,多对所述训练图像均位于RGB域,每对所述训练图像的内容相同但暗区对应的颜色不同。
  11. 根据权利要求8所述的方法,其特征在于,所述方法还包括:
    利用多对训练图像,对初始风格迁移网络模型进行训练,确定所述目标风格迁移网络模型;其中,多对所述训练图像均位于YUV域,每对所述训练图像的内容相同但暗区对应的颜色不同。
  12. 根据权利要求10或11所述的方法,其特征在于,利用多对训练图像,对初始风格迁移网络模型进行训练,确定所述目标风格迁移网络模型,包括:
    针对每对所述训练图像中的每帧所述训练图像,利用1个特征网络模型进行处理,得到对应的特征层;
    利用每对所述训练图像对应的2个特征层对所述初始风格迁移网络模型进行训练,得到所述目标风格迁移网络模型。
  13. 根据权利要求10或11所述的方法,其特征在于,利用多对训练图像,对初始风格迁移网络模型进行训练,确定所述目标风格迁移网络模型,包括:
    将每对所述训练图像包括的2帧所述训练图像进行拼接,得到拼接训练图像;
    利用特征网络模型对所述拼接训练图像进行处理,得到对应的拼接特征层;
    利用所述拼接特征层对所述初始风格迁移网络模型进行训练,得到所述目标风格迁移网络模型。
  14. 根据权利要求12或13所述的方法,其特征在于,所述特征提取网络模型为resnet模型、vgg模型、mobilenet中的任意一种。
  15. 根据权利要求2所述的方法,其特征在于,所述深度学习网络模型为Unet模型、LLnet模型和FCN模型中的任意一种。
  16. 根据权利要求2所述的方法,其特征在于,所述第一后端处理包括:RGB域转YUV域。
  17. 根据权利要求16所述的方法,其特征在于,所述第一后端处理还包括:动态范围调整、色调映射中的至少一项。
  18. 根据权利要求3或4所述的方法,其特征在于,所述第二后端处理包括:RAW域转YUV域。
  19. The method according to any one of claims 5 to 13, wherein the target style transfer network model is any one of a Resnet model, a vgg model, a unet model, and a vnet model.
  20. 一种电子设备,其特征在于,所述电子设备包括:
    一个或多个处理器和存储器;
    所述存储器与所述一个或多个处理器耦合,所述存储器用于存储计算机程序代码,所述计算机程序代码包括计算机指令,所述一个或多个处理器调用所述计算机指令以使得所述电子设备执行如权利要求1至19中任一项所述的图像处理方法。
  21. 一种芯片系统,其特征在于,所述芯片系统应用于电子设备,所述芯片系统包括一个或多个处理器,所述处理器用于调用计算机指令以使得所述电子设备执行如权利要求1至19中任一项所述的图像处理方法。
  22. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质存储了计算机程序,当所述计算机程序被处理器执行时,使得处理器执行权利要求1至19中任一项所述的图像处理方法。
  23. 一种计算机程序产品,其特征在于,所述计算机程序产品包括计算机程序代码,当所述计算机程序代码被处理器执行时,使得处理器执行权利要求1至19中任一项所述的图像处理方法。
PCT/CN2022/113424 2021-12-31 2022-08-18 图像处理方法及其相关设备 Ceased WO2023124123A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP22913408.5A EP4340383B1 (en) 2021-12-31 2022-08-18 Image processing method and related device thereof
US18/574,139 US20240320794A1 (en) 2021-12-31 2022-08-18 Image processing method and related device thereof

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111677018.1A CN116416122B (zh) 2021-12-31 2021-12-31 图像处理方法及其相关设备
CN202111677018.1 2021-12-31

Publications (1)

Publication Number Publication Date
WO2023124123A1 true WO2023124123A1 (zh) 2023-07-06

Family

ID=86997396

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/113424 Ceased WO2023124123A1 (zh) 2021-12-31 2022-08-18 图像处理方法及其相关设备

Country Status (4)

Country Link
US (1) US20240320794A1 (zh)
EP (1) EP4340383B1 (zh)
CN (1) CN116416122B (zh)
WO (1) WO2023124123A1 (zh)

Also Published As

Publication number Publication date
EP4340383A4 (en) 2024-11-27
US20240320794A1 (en) 2024-09-26
CN116416122A (zh) 2023-07-11
EP4340383B1 (en) 2025-10-01
EP4340383A1 (en) 2024-03-20
CN116416122B (zh) 2024-04-16

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 2022913408

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2022913408

Country of ref document: EP

Effective date: 20231213

WWE Wipo information: entry into national phase

Ref document number: 18574139

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

WWG Wipo information: grant in national office

Ref document number: 2022913408

Country of ref document: EP