WO2022001509A1 - Image optimization method and apparatus, computer storage medium, and electronic device - Google Patents
- Publication number
- WO2022001509A1 (application PCT/CN2021/096024)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- optimized
- target
- network
- low
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/70—Denoising; Smoothing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/73—Deblurring; Sharpening
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0475—Generative networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/094—Adversarial learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4046—Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4053—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/20—Image enhancement or restoration using local operators
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/60—Image enhancement or restoration using machine learning, e.g. neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/77—Retouching; Inpainting; Scratch removal
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/30—Determination of transform parameters for the alignment of images, i.e. image registration
- G06T7/37—Determination of transform parameters for the alignment of images, i.e. image registration using transform domain methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2200/00—Indexing scheme for image data processing or generation, in general
- G06T2200/24—Indexing scheme for image data processing or generation, in general involving graphical user interfaces [GUIs]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20024—Filtering details
- G06T2207/20028—Bilateral filtering
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20024—Filtering details
- G06T2207/20032—Median filtering
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20048—Transform domain processing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
- G06T2207/30201—Face
Definitions
- The present application relates to the field of artificial intelligence, and in particular to image optimization technology.
- During imaging, transmission, and acquisition, an image is inevitably affected by external interference and imperfect transmission equipment. As a result, the image picks up noise, loses its original detail, and becomes blurred. Restoring that detail requires optimizing the image.
- The first method repairs the noise and blur in the image with one or more image filtering methods.
- The second method sharpens the image with a neural network, and the third method uses a neural network to super-resolve the image. However, these three methods suffer from poor denoising, poor sharpening, or the addition of details that do not match the original image, any of which degrades the user experience.
- The present application provides an image optimization method, apparatus, computer storage medium, and electronic device that optimize images at least to a certain extent, improving image quality and, in turn, the user experience.
- An image optimization method, comprising: acquiring an image to be optimized; performing alignment processing on the image to be optimized to obtain an aligned image to be optimized, in which the points of each object in a target area are distributed at standard positions; and inputting the aligned image to be optimized into a generation network, which performs feature extraction on it to obtain an optimized image; wherein the generation network is obtained by training a to-be-trained generative adversarial deep neural network model according to low-quality image pairs and a joint loss function, and each low-quality image pair includes a target image and a low-quality image corresponding to the target image.
- An image optimization apparatus, comprising: an acquisition module for acquiring an image to be optimized; an alignment module for performing alignment processing on the image to be optimized to obtain an aligned image to be optimized, in which the points of each object in a target area are distributed at standard positions; and an optimization module for inputting the aligned image to be optimized into a generation network, which performs feature extraction on it to obtain an optimized image.
- The generation network is obtained by training a to-be-trained generative adversarial deep neural network model according to low-quality image pairs and a joint loss function, and each low-quality image pair includes a target image and a low-quality image corresponding to the target image.
- A computer storage medium storing a computer program which, when executed by a processor, implements the image optimization method described in the first aspect above.
- An electronic device for image optimization, comprising: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to execute the executable instructions to perform the image optimization method described in the first aspect.
- A computer program product which, when executed, performs the image optimization method described in the first aspect above.
- FIG. 1 shows a schematic diagram of an exemplary system architecture to which the technical solutions of the embodiments of the present application can be applied;
- FIG. 2 shows a schematic flowchart of an image optimization method in an exemplary embodiment of the present application;
- FIG. 3 shows a schematic flowchart of obtaining an aligned image to be optimized according to an embodiment of the present application;
- FIGS. 4A, 4B, and 4C show schematic interface diagrams of performing alignment processing on a face image according to an embodiment of the present application;
- FIG. 5 shows a schematic flowchart of training a generative adversarial deep neural network model according to an embodiment of the present application;
- FIG. 6 shows a schematic flowchart of acquiring multiple low-quality image pairs according to an embodiment of the present application;
- FIG. 7 shows a schematic structural diagram of a generative adversarial deep neural network model to be trained according to an embodiment of the present application;
- FIG. 8 shows a schematic structural diagram of a generation network according to an embodiment of the present application;
- FIG. 9 shows a schematic structural diagram of a post-processing network according to an embodiment of the present application;
- FIG. 10 shows an overall flowchart of training the generative adversarial deep neural network model on face images according to an embodiment of the present application;
- FIGS. 11A, 11B, 11C, 11D, 11E, and 11F show schematic interface diagrams of three groups of face images optimized with the trained generation network according to an embodiment of the present application;
- FIG. 12 shows a schematic structural diagram of an image optimization apparatus according to an embodiment of the present application;
- FIGS. 13A, 13B, 13C, and 13D show schematic interface diagrams of optimizing low-quality images according to an embodiment of the present application;
- FIG. 14 shows a schematic structural diagram of an electronic device according to an embodiment of the present application.
- FIG. 1 shows a schematic diagram of an exemplary system architecture to which the technical solutions of the embodiments of the present application can be applied.
- The system architecture 100 may include a mobile terminal 101, an information transmission terminal 102, a network 103, and a server 104.
- The mobile terminal 101 can be a terminal device with a camera and a display screen, such as a mobile phone, a portable computer, or a tablet computer.
- The information transmission terminal 102 can be an intelligent terminal, such as an intelligent electronic device running any of various operating systems.
- The network 103 is the medium that provides communication links between the mobile terminal 101 and the server 104, and between the information transmission terminal 102 and the server 104.
- The network 103 may include various connection types, such as wired communication links and wireless communication links.
- The network 103 between the mobile terminal 101 and the information transmission terminal 102 may provide a communication link through a wireless network.
- The network 103 between the mobile terminal 101 and the server 104, and between the information transmission terminal 102 and the server 104, may be a wireless communication link, specifically a mobile network.
- The numbers of terminals (e.g., mobile terminals 101 and information transmission terminals 102), networks, and servers in FIG. 1 are merely illustrative. There can be any number of terminals, networks, and servers according to implementation needs.
- The server 104 may be a server cluster composed of multiple servers, and may be used to store information related to image optimization processing.
- In one embodiment, the mobile terminal 101 sends the image to be optimized to the server 104; the server 104 performs alignment processing on the image to be optimized and obtains the corresponding aligned image to be optimized.
- The server 104 inputs the aligned image to be optimized into the generation network, which performs feature extraction on it to obtain an optimized image, and returns the optimized image to the mobile terminal 101.
- The generation network is obtained by training the to-be-trained generative adversarial deep neural network model according to low-quality image pairs and the joint loss function, and each low-quality image pair includes a target image and the low-quality image corresponding to the target image.
- In another embodiment, the mobile terminal 101, after acquiring the image to be optimized, sends it to the information transmission terminal 102; the information transmission terminal 102 aligns the image to be optimized, obtains the corresponding aligned image to be optimized, and sends the aligned image to the server 104; the server 104 inputs the aligned image to be optimized into the generation network, which performs feature extraction on it to obtain the optimized image, and returns the optimized image.
- The generation network is obtained by training the to-be-trained generative adversarial deep neural network model according to low-quality image pairs and the joint loss function, and each low-quality image pair includes a target image and the low-quality image corresponding to the target image.
- In yet another embodiment, the mobile terminal 101, after acquiring the image to be optimized, performs alignment processing on it, obtains the corresponding aligned image to be optimized, and sends the aligned image to the server 104.
- The server 104 inputs the aligned image to be optimized into the generation network, which performs feature extraction on it to obtain an optimized image, and returns the optimized image to the mobile terminal 101.
- The generation network is obtained by training the to-be-trained generative adversarial deep neural network model according to low-quality image pairs and the joint loss function, and each low-quality image pair includes a target image and the low-quality image corresponding to the target image.
- The image optimization method provided by the embodiments of the present application is generally executed by the server 104, and accordingly the image optimization apparatus is generally set in the server 104.
- The terminal may also have functions similar to those of the server, so that it can execute the image optimization solution provided by the embodiments of the present application.
- Of the three related-art approaches, the first deblurs the image through image processing, using one or more image filtering methods to repair the noise and blur in the image; the second uses a neural network to sharpen the image; the third uses a neural network to perform image super-resolution for clarity.
- Each of these three methods has its own defects.
- The first method is more one-sided in its processing than a neural network. Because it cannot fully fit the noise and blur distributions found in reality, it cannot achieve a good denoising effect.
- The second method focuses mainly on image sharpening for general scenes, and the methods it uses to produce low-quality images are of uneven quality. If the combinations used when producing low-quality images are not rich enough, the neural network cannot fit the distribution of real blurred images well, and the generated images have poor clarity. In addition, images of different sizes are not normalized, so related images are processed inconsistently because their regions differ in size.
- In the present application, an image optimization model is determined by training a generative adversarial deep neural network model, and an image to be processed is optimized through the image optimization model.
- The low-quality image can be denoised, sharpened, and supplemented with generated details, so that it keeps the characteristics of the original image while becoming clearer, with higher image quality and a better user experience.
- The technical solution of the present application processes images at low cost and has a wide range of applications.
- A generative adversarial deep neural network model is a type of neural network model. Compared with a traditional neural network model, its main feature is that it has a discriminative network structure in addition to the generation network structure.
- The generation network is used to generate images, while the discriminative network is used to judge the authenticity of an image (either a target image or a generated image).
- Iterative training is driven by the difference between the generated image and the target image and by the error of the discriminative network in judging images.
- Through this adversarial training, the network parameters of the generation network are optimized so that the generated images approach the target requirements. The mutual confrontation between the generation network and the discriminative network is what allows a generative adversarial deep neural network model to generate high-quality images.
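For reference, the adversarial game between the two networks is commonly written as the minimax objective below. This is the textbook formulation, not the joint loss function of the present application, which is not given in this passage. The symbols $G$ (generation network), $D$ (discriminative network), $x$ (low-quality image), and $y$ (target image) are notation assumed here for illustration.

```latex
\min_{G}\,\max_{D}\;
\mathbb{E}_{y}\bigl[\log D(y)\bigr]
+ \mathbb{E}_{x}\bigl[\log\bigl(1 - D(G(x))\bigr)\bigr]
```

The discriminative network is rewarded for scoring target images high and generated images low, while the generation network is rewarded for producing images the discriminative network scores as real.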
- The image optimization method provided by the embodiments of the present application is implemented based on a generative adversarial deep neural network model and relates to the technical field of artificial intelligence.
- Artificial intelligence is the theory, methods, technology, and application systems that use digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
- Artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can respond in a way similar to human intelligence.
- Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that machines have the functions of perception, reasoning, and decision-making.
- An image optimization method is provided that overcomes, at least to a certain extent, the defects of the related art.
- The image optimization method provided in this embodiment may be executed by a device with computing and processing capability, such as a server or a terminal device, or jointly by a server and a terminal device, where the terminal device and the server may be the mobile terminal 101 and the server 104 shown in FIG. 1, respectively.
- The image optimization method of the present application can be used to optimize any low-quality image. For example, it can process low-quality face images, animal images, and images of buildings with fixed structures to restore and improve image detail. The method is described in detail below, taking the server as the executing entity and a human face image as the low-quality image.
- FIG. 2 shows a schematic flowchart of an image optimization method in an exemplary embodiment of the present application.
- The image optimization method provided by this embodiment includes the following steps:
- The image to be optimized is a low-quality image, which chiefly means poor image definition and heavy noise.
- In this example, the image to be optimized is a low-quality human face image.
- The low-quality face image may be an image that a user captures of the target person's face, or of a part of the body containing the face, with a terminal device that has a camera and an imaging unit. The face may appear at any angle, as long as the person's facial features can be obtained from the image. The image to be optimized can also be an image containing a human face that the user downloaded over the network, and so on.
- Alignment processing is performed on the image to be optimized to obtain an aligned image to be optimized.
- Taking a low-quality face image as the image to be optimized, alignment corrects the face to the standard frontal face position (i.e., the standard position).
- A standard position template can be used to correct the face image. The standard position template is the point distribution of each object in a specific area; for a face image, it is the point distribution of the facial features in the face area.
- The point coordinates of the facial features when the face is in the standard frontal face position can be obtained from statistics over a large amount of face data, forming a five-point coordinate template, that is, the standard position template. The five points are the two points marking the left and right eyes, the point marking the tip of the nose, and the two points marking the left and right corners of the mouth.
- The average of all the coordinates corresponding to the same part can be used as that part's point coordinates in the five-point coordinate template. For example, the coordinates of the left eye in all the face data can be collected, summed, and averaged to obtain the point coordinates of the left eye in the standard position template.
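The averaging described above can be sketched in a few lines of NumPy. This is only an illustration: the function name `build_standard_template`, the `(num_faces, 5, 2)` array layout, and the toy coordinates are assumptions made here, not details from the application.

```python
import numpy as np

def build_standard_template(landmarks):
    """Average each facial part's (x, y) coordinates over many faces.

    landmarks: array of shape (num_faces, 5, 2), ordered as left eye,
    right eye, nose tip, left mouth corner, right mouth corner
    (an ordering assumed for this sketch).
    """
    landmarks = np.asarray(landmarks, dtype=np.float64)
    if landmarks.ndim != 3 or landmarks.shape[1:] != (5, 2):
        raise ValueError("expected an array of shape (num_faces, 5, 2)")
    # Mean over the face axis gives one (x, y) point per facial part.
    return landmarks.mean(axis=0)

# Toy landmark data for two faces (made-up values for illustration).
faces = np.array([
    [[30, 40], [70, 40], [50, 60], [35, 80], [65, 80]],
    [[32, 42], [72, 38], [50, 62], [37, 82], [63, 78]],
])
template = build_standard_template(faces)
```

In practice the input faces would first be brought to a common image size, since averaging raw pixel coordinates across differently sized images is not meaningful.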
- The image to be optimized can be aligned according to the standard position template, that is, the face in the image is corrected to the standard frontal face position.
- FIG. 3 shows a schematic flowchart of obtaining the aligned image to be optimized. As shown in FIG. 3, the process includes S301-S303:
- A target area in the image to be optimized is detected, the target area being of the same type as the specific area.
- Because the image to be optimized is aligned according to the standard position template, the target area of the same type as the specific area corresponding to the template must first be determined in the image. In other words, the specific area and the target area correspond to the same kind of object; for example, both are human face areas or both are animal face areas.
- If the standard position template is the template for the face area, the face area must be extracted from the image to be optimized and then aligned according to the template.
- Besides the face, the image to be optimized may also include other parts of the human body.
- For example, a half-body photo includes the neck and upper body in addition to the face area.
- A model capable of face recognition can be used to identify the face area, which is determined by recognizing the facial features. Since it is the face area in the image to be optimized that needs to be aligned, the standard position template used is also the one for the face area.
- A transformation matrix between the two can be determined from the image data of the face area and the data of the standard position template. The face area in the image to be optimized can then be corrected, according to the transformation matrix, into a face area aligned with the five-point coordinates in the standard position template.
- A transformation operation is performed on the image corresponding to the target area according to the transformation matrix to obtain the aligned image to be optimized.
- Operations such as translation, rotation, and scaling can be performed on the image to be optimized according to the transformation matrix, normalizing it to a form consistent with the standard frontal face position and yielding the aligned image to be optimized.
- The face alignment can also be reversed: the aligned face can be restored to its originally captured state by applying the inverse of the transformation matrix.
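One way to obtain such a transformation matrix is a least-squares fit of a similarity transform (translation, rotation, uniform scaling) between the detected five points and the template. The sketch below is an assumption about how this could be done, not the method of the application; `estimate_similarity`, `apply_transform`, and all coordinates are hypothetical. It transforms point coordinates only; warping the actual pixels would additionally need an image resampling routine.

```python
import numpy as np

def estimate_similarity(src, dst):
    """Least-squares similarity transform mapping src points onto dst.

    Models x' = a*x - b*y + tx and y' = b*x + a*y + ty, and returns
    the 3x3 homogeneous matrix.
    """
    src = np.asarray(src, dtype=np.float64)
    dst = np.asarray(dst, dtype=np.float64)
    n = src.shape[0]
    A = np.zeros((2 * n, 4))
    A[0::2, 0] = src[:, 0]   # a * x
    A[0::2, 1] = -src[:, 1]  # -b * y
    A[0::2, 2] = 1.0         # tx
    A[1::2, 0] = src[:, 1]   # a * y
    A[1::2, 1] = src[:, 0]   # b * x
    A[1::2, 3] = 1.0         # ty
    a, b, tx, ty = np.linalg.lstsq(A, dst.reshape(-1), rcond=None)[0]
    return np.array([[a, -b, tx], [b, a, ty], [0.0, 0.0, 1.0]])

def apply_transform(M, pts):
    """Apply a 3x3 homogeneous transform to an (n, 2) array of points."""
    pts = np.asarray(pts, dtype=np.float64)
    homo = np.hstack([pts, np.ones((pts.shape[0], 1))])
    return (homo @ M.T)[:, :2]

# Hypothetical five-point template and a tilted, scaled, shifted face.
template = np.array([[30.0, 40.0], [70.0, 40.0], [50.0, 60.0],
                     [35.0, 80.0], [65.0, 80.0]])
theta = np.deg2rad(20.0)
tilt = np.array([[1.3 * np.cos(theta), -1.3 * np.sin(theta), 12.0],
                 [1.3 * np.sin(theta), 1.3 * np.cos(theta), -5.0],
                 [0.0, 0.0, 1.0]])
detected = apply_transform(tilt, template)

M = estimate_similarity(detected, template)             # alignment
aligned = apply_transform(M, detected)
restored = apply_transform(np.linalg.inv(M), aligned)   # inverse operation
```

Applying `np.linalg.inv(M)` corresponds to the reverse step: it maps the aligned points back to their originally captured positions.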
- FIGS. 4A, 4B, and 4C show schematic interface diagrams of aligning a face image.
- As shown in FIG. 4A, the image to be optimized is a low-quality face image in which the face area is tilted and not in the standard frontal face position.
- FIG. 4B shows the standard position template, i.e., the five-point coordinates corresponding to the face image. The face image to be optimized shown in FIG. 4A is aligned according to the standard position template shown in FIG. 4B.
- This yields the aligned face image to be optimized shown in FIG. 4C, in which the face area conforms to the standard frontal face position.
- The aligned image to be optimized is input into a generation network, which performs feature extraction on it to obtain an optimized image; the generation network is obtained by training the to-be-trained generative adversarial deep neural network model according to low-quality image pairs and a joint loss function, and each low-quality image pair includes a target image and a low-quality image corresponding to the target image.
- After the aligned image to be optimized is input into the generation network and feature extraction is performed on it, the optimized image is obtained. The optimized image is the result of denoising and sharpening the image to be optimized and generating facial details.
- The generation network is a part of the generative adversarial deep neural network model and can generate an optimized image corresponding to the input aligned image to be optimized. Before the generation network is used to generate optimized images, the to-be-trained generative adversarial deep neural network model must be trained to obtain a stable generation network.
- FIG. 5 shows a schematic flowchart of training a generative adversarial deep neural network model.
- The process of training a generative adversarial deep neural network model includes S501-S504:
- Low-quality image pairs can be used as training samples for the to-be-trained generative adversarial deep neural network model. The low-quality image is the input sample, and the target image corresponding to it is the verification sample, used to judge whether the performance of the generation network is stable. In other words, in each low-quality image pair the low-quality image is the image to be optimized, and the target image is the image expected after optimization.
- FIG. 6 shows a schematic flowchart of acquiring multiple low-quality image pairs. As shown in FIG. 6 , the process specifically includes S601-S604:
- a large number of clear images can be acquired as target images in advance, and the specific number can be determined according to actual needs. The more the number, the higher the performance of the model.
- the clear image is used as the target image.
- the face regions in each target image can be aligned to obtain aligned images.
- the face area in the target image can be aligned according to the standard position template. For example, the face area in the target image can be detected first, and then the point coordinates of the facial features in the face area can be extracted. Finally, according to the extracted The point coordinates of the facial features are aligned with those of the facial features in the standard location template to obtain multiple aligned images.
- a low-quality image pair is formed according to the target image and the low-quality image corresponding to the target image.
- a low-quality image may be formed by performing low-quality processing on each aligned image.
- the low-quality processing may include noise addition and/or blurring, wherein the noise addition includes adding one or more of Gaussian noise, Poisson noise, and salt and pepper noise, and the blurring includes one or more of mean filtering, Gaussian filtering, median filtering, bilateral filtering, and resolution reduction.
- the noise types and blurring methods used in the embodiments of the present application are not limited to the above-mentioned types, and may also include other types of noise and/or blurring methods, which will not be repeated in this application.
- μ represents the mean of the distribution
- σ represents the standard deviation of the distribution
- σ² represents the variance of the distribution.
- μ and σ can be determined randomly. After the parameters are determined, noise is added to the color value of each pixel in the image according to the probability distribution, and finally the color values of the pixels are scaled to [0, 255], which completes the addition of Gaussian noise.
- the parameter λ can be determined randomly. After the parameter is determined, the color value of each pixel in the image can be processed according to the probability distribution of Poisson noise to add Poisson noise.
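- the two noise types above can be sketched as follows. This is an illustration under stated assumptions, not the method of the application: both noises are treated as additive, values are clipped (rather than rescaled) to [0, 255], and the function names are hypothetical.

```python
import numpy as np

def add_gaussian_noise(image, mean, sigma, rng):
    """Add noise drawn from N(mean, sigma^2) to every pixel (assumed additive)."""
    noisy = image.astype(np.float64) + rng.normal(mean, sigma, size=image.shape)
    # Assumption: out-of-range values are clipped to the valid 8-bit range.
    return np.clip(np.rint(noisy), 0, 255).astype(np.uint8)

def add_poisson_noise(image, lam, rng):
    """Add noise drawn from Poisson(lam) to every pixel (assumed additive)."""
    noisy = image.astype(np.float64) + rng.poisson(lam, size=image.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)
```

The random parameters mentioned in the text (the mean, the standard deviation and the Poisson parameter) would be drawn once per image before calling these functions.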
- Salt and pepper noise is to randomly add black and white pixels to the image.
- the number of black and white pixels can be controlled by the signal-to-noise ratio, and the signal-to-noise ratio can be determined randomly.
- the number of pixels to receive noise can be determined according to the signal-to-noise ratio; the position of a pixel to receive noise is then selected at random within the image area, and the pixel value at that position is set to 255 or 0; finally, the above steps are repeated for the remaining pixels to be corrupted, which completes the addition of salt and pepper noise to the image.
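- a minimal sketch of salt and pepper noise follows. It assumes the signal-to-noise ratio is the fraction of pixels left untouched, which is one common convention and not something the text states; the function name is hypothetical.

```python
import numpy as np

def add_salt_pepper_noise(image, snr, rng):
    """Set a random subset of pixels to 0 or 255.

    Assumption: snr in [0, 1] is the fraction of pixels that stay unchanged.
    """
    h, w = image.shape[:2]
    num_noisy = int(round(h * w * (1.0 - snr)))
    noisy = image.copy()
    # Pick distinct pixel positions, then a random black/white value for each.
    flat_idx = rng.choice(h * w, size=num_noisy, replace=False)
    rows, cols = np.unravel_index(flat_idx, (h, w))
    values = rng.choice(np.array([0, 255], dtype=image.dtype), size=num_noisy)
    if noisy.ndim == 3:
        noisy[rows, cols, :] = values[:, None]  # same value on every channel
    else:
        noisy[rows, cols] = values
    return noisy
```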
- mean filtering replaces the value of the target pixel with the average of the target pixel and its surrounding pixels. Its expression is shown in formula (3):
- M represents the size of the coefficient template
- f(x, y) represents the pixel value of the target pixel in the image and the surrounding pixels corresponding to M
- s represents all the pixels in the image
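- mean filtering as described above can be sketched for a single-channel image as follows. The square k×k window (so that M = k·k), the odd window size and the edge-replicating border handling are assumptions made for this illustration; `mean_filter` is a hypothetical name.

```python
import numpy as np

def mean_filter(image, k):
    """Replace each pixel by the average of its k x k neighborhood (k odd)."""
    pad = k // 2
    # Assumption: borders are handled by replicating the edge pixels.
    padded = np.pad(image.astype(np.float64), pad, mode="edge")
    h, w = image.shape
    total = np.zeros((h, w), dtype=np.float64)
    # Sum the k*k shifted copies of the image, then divide by M = k*k.
    for dy in range(k):
        for dx in range(k):
            total += padded[dy:dy + h, dx:dx + w]
    return total / (k * k)
```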
- Gaussian filtering uses the normal distribution to calculate the transformation of each pixel in the image, and its expression is shown in formula (4):
- u² + v² is the square of the blur radius, and the size of the blur radius can be determined randomly. After the blur radius and the variance are determined, the color value of each pixel in the image can be transformed according to the normal distribution, so as to blur the image.
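- formula (4) is not reproduced in this text. For reference, the standard two-dimensional Gaussian kernel, which matches the symbols described here (u and v as offsets from the target pixel, σ² as the variance), has the following form; whether the application uses exactly this normalization is an assumption.

```latex
G(u, v) = \frac{1}{2\pi\sigma^{2}} \exp\!\left(-\frac{u^{2} + v^{2}}{2\sigma^{2}}\right)
```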
- Median filtering sets the pixel value of each pixel to the median of all pixel values in a neighborhood window around that pixel, and the kernel size of the neighborhood window can be determined randomly.
- Bilateral filtering is an edge-preserving filtering method that considers both the spatial position (spatial kernel) and the pixel value (value-domain kernel). The kernel size of the spatial position (i.e., the radius of the Gaussian filter) and the size of the value-domain kernel can both be determined randomly.
- Reducing the resolution can reduce the image quality by first reducing the resolution randomly and then upsampling back to the original resolution.
- the degree of reducing the resolution may be determined randomly.
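- resolution reduction can be sketched as follows. Nearest-neighbor subsampling and pixel repetition are assumptions chosen to keep the example dependency-free; the text does not specify the interpolation method, and `reduce_resolution` is a hypothetical name.

```python
import numpy as np

def reduce_resolution(image, factor):
    """Downsample by an integer factor, then upsample back to the original size."""
    h, w = image.shape[:2]
    small = image[::factor, ::factor]  # keep every factor-th pixel
    restored = np.repeat(np.repeat(small, factor, axis=0), factor, axis=1)
    return restored[:h, :w]  # crop in case h or w is not a multiple of factor
```

The output has the same shape as the input, but fine detail inside each factor×factor block is lost, which is the intended degradation.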
- a combination of the above methods may be applied to a target image, such as a face image, to obtain a low-quality image, such as a low-quality face image.
- the low-quality image pairs formed from the low-quality images processed in this way are used to train the generative adversarial deep neural network model to be trained, which can improve the processing accuracy of the model for various low-quality images.
- each low-quality image pair is used as a target image pair, and the low-quality image in the target image pair is input to the generation network in the generative adversarial deep neural network model to be trained to obtain a generated image.
- the generated image and the target image in the target image pair are input to the post-processing network in the generative adversarial deep neural network model to be trained, and are processed by the post-processing network to construct a joint loss function.
- the parameters of the generative adversarial deep neural network model to be trained are optimized according to the joint loss function to obtain the generation network.
- FIG. 7 shows a schematic structural diagram of the generative adversarial deep neural network model to be trained.
- the generative adversarial deep neural network model 700 to be trained includes a generation network 701 and a post-processing network 702, wherein the generation network 701 is used to process the input low-quality image to output a generated image, and the post-processing network 702 is used to construct a joint loss function according to the target image and the generated image output by the generation network 701, and to optimize the parameters of the model based on the joint loss function.
- FIG. 8 shows a schematic structural diagram of a generation network.
- the generation network 800 provided in this embodiment includes a downsampling layer 801 , a residual network layer 802 and an upsampling layer 803 .
- the number of residual network layers 802 may be set to multiple, for example, 4, 5, and so on.
- the downsampling layer 801 may include multiple convolutional layers with different sizes, and the upsampling layer 803 may also include multiple convolutional layers with different sizes. The convolution operations in the downsampling process can be used to extract the deep features of the image, but multiple convolution operations make the resulting feature map progressively smaller than the input image, which causes information loss.
- therefore, after feature extraction at the residual network layer, the feature map can be restored to the size of the input image by upsampling, which reduces the loss of original information as it passes through the network, mitigates structural or semantic inconsistency in the output of the latter half of the network, and ultimately improves the quality of the optimized image.
- the number and size of the convolutional layers included in the downsampling layer 801 and the upsampling layer 803 can be set according to actual needs. For example, if the downsampling layer 801 includes two convolutional layers with sizes of 512×512 and 256×256 sequentially from front to back, then the upsampling layer 803 may include two convolutional layers with sizes of 256×256 and 512×512 sequentially from front to back.
- Figure 9 shows a schematic structural diagram of the post-processing network.
- the post-processing network 900 includes a discriminant network 901, a classification network 902 and a segmentation network 903, wherein the discriminant network 901 may include multiple convolutional layers for performing feature extraction on the target image and the generated image and judging the confidence of the target image and the generated image;
- the classification network 902 may be a classification network such as VGG, which is used to perform feature extraction on the target image and the generated image to obtain the corresponding classification results.
- the segmentation network 903 may be a commonly used segmentation network, which is used to segment the target image and to determine, according to the position information of each object in the target image, the image information in the target image and in the generated image that corresponds to the position information of the same object. For example, a face image is segmented to obtain the position information of the facial features in the face image, and the image information in the target image and in the generated image that corresponds to the position information of the facial features is determined according to that position information.
- using the segmentation network 903 can ensure the consistency of the pixels of the image itself.
- a joint loss function can be constructed according to the processing results of the target image and the generated image. Based on the joint loss function, the parameters of the generative adversarial deep neural network model to be trained can be adjusted by back-propagation. After several iterations of training, a generative adversarial deep neural network model with a converged loss function and stable performance can be obtained, and with it a generation network for optimizing the low-quality images to be optimized.
- when constructing the joint loss function, a loss function can be constructed through each of the discriminant network, the classification network and the segmentation network. Specifically, the generated image and the target image in the target image pair are input to the discriminant network to obtain a first discrimination result and a second discrimination result, and a first loss function is constructed according to the first discrimination result and the second discrimination result; the generated image and the target image in the target image pair are input to the classification network to obtain first image information and second image information, and a second loss function is constructed according to the first image information and the second image information; the generated image and the target image in the target image pair are input to the segmentation network to obtain first partial image information and second partial image information, and a third loss function is constructed according to the first partial image information and the second partial image information; finally, the joint loss function is constructed according to the first loss function, the second loss function and the third loss function.
- the generation network G is used to optimize the low-quality image (input image), and output the image after the optimization process as the generated image.
- the discrimination network D receives the above-mentioned generated image and the target image corresponding to the above-mentioned low-quality image (input image), and discriminates whether an image (including the target image and the generated image) is true or false.
- the training goal of the discriminant network D is: to discriminate the target image as true, and to discriminate the generated image as false.
- the training goal of the generation network G is: to optimize the low-quality image (input image) so that the resulting generated image is discriminated as true by the discriminant network, that is, to make the generated image as close as possible to the target image so that it can pass for a real image. The first loss function therefore includes a discriminator loss function and a generator loss function.
- the discriminant network D discriminates the generated image to produce the first discrimination result D(G(z_i)), and discriminates the target image to produce the second discrimination result D(x_i), where z_i is the data of the low-quality image input to the generation network, G(z_i) is the data of the generated image output by the generation network after optimizing the low-quality image, x_i is the data of the target image corresponding to the low-quality image, and D(G(z_i)) and D(x_i) are the binary classification confidences output by the discriminant network for the generated image and the target image, respectively.
- the discriminator loss function LossD and the generator loss function LossG can be defined as shown in equations (5)-(6), respectively:
- x_i is the data of the target image corresponding to the low-quality image
- z_i is the data of the low-quality image input to the generation network
- i is any low-quality image pair
- m is the total number of low-quality image pairs.
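- formulas (5) and (6) are not reproduced in this text. The sketch below uses the standard log-likelihood GAN losses, which are consistent with the symbols defined above, but it is an assumption that the application uses exactly this form; the function names are hypothetical.

```python
import numpy as np

def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Assumed standard form: -(1/m) * sum(log D(x_i) + log(1 - D(G(z_i))))."""
    d_real = np.clip(d_real, eps, 1.0 - eps)  # avoid log(0)
    d_fake = np.clip(d_fake, eps, 1.0 - eps)
    return -np.mean(np.log(d_real) + np.log(1.0 - d_fake))

def generator_loss(d_fake, eps=1e-12):
    """Assumed standard form: (1/m) * sum(log(1 - D(G(z_i))))."""
    d_fake = np.clip(d_fake, eps, 1.0 - eps)
    return np.mean(np.log(1.0 - d_fake))
```

Here `d_real` holds the confidences D(x_i) and `d_fake` holds D(G(z_i)) for a batch of m pairs. The discriminator loss falls as D separates the two, while the generator loss falls as D(G(z_i)) approaches 1.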
- the parameters of the generation network can be fixed first, and the parameters of the discriminant network can be optimized according to formula (5), so that the discrimination accuracy of the discriminant network reaches a preset threshold; the optimized parameters of the discriminant network are then fixed, and the parameters of the generation network are optimized according to formula (6), so that the generation network can generate a clear optimized image.
- the generated image and the target image should be close both in low-level pixel values and in high-level abstract features. Therefore, to ensure that the generated image and the target image are consistent in deep semantics, the generated image and the target image can also be compared through the classification network, a perceptual loss function can be constructed according to the comparison result, and the parameters of the classification network and the generation network can then be optimized based on the perceptual loss function.
- the first image information can be obtained by processing the generated image through the classification network, the second image information can be obtained by processing the target image through the classification network, and the second loss function, that is, the perceptual loss function, can be determined according to the first image information and the second image information corresponding to each low-quality image pair.
- the second loss function is determined according to the first image information and the second image information corresponding to each low-quality image pair. Specifically, the first image information and the second image information corresponding to each low-quality image pair can be subtracted to obtain an image information difference, and the second loss function is constructed according to the image information differences corresponding to all low-quality image pairs. The expression of the second loss function is shown in formula (7):
- V(G(z_i)) is the first image information
- V(x_i) is the second image information
- i is any low-quality image pair
- m is the total number of low-quality image pairs.
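- formula (7) is not reproduced in this text. The sketch below averages the squared difference between the two sets of features over the m pairs; the squared L2 form is an assumption (the text only says the two are subtracted), and `perceptual_loss` is a hypothetical name.

```python
import numpy as np

def perceptual_loss(feat_generated, feat_target):
    """Assumed form: (1/m) * sum_i ||V(G(z_i)) - V(x_i)||^2.

    Both inputs have shape (m, ...): one feature tensor per image pair.
    """
    diff = np.asarray(feat_generated, dtype=float) - np.asarray(feat_target, dtype=float)
    m = diff.shape[0]
    # Sum squared differences per pair, then average over the m pairs.
    return np.sum(diff.reshape(m, -1) ** 2, axis=1).mean()
```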
- the parameters of the generation network can be optimized based on the second loss function, so that the generated image output by the generation network is close to or the same as the target image.
- the image information corresponding to the same object in the generated image and the target image can be compared. If the image information corresponding to the same object in the generated image and in the target image is similar or the same, the generated image is similar to or the same as the target image.
- the target image may be segmented through a segmentation network to obtain position information of each object in the image.
- for example, the segmentation network can be used to segment a face image to obtain the position information of the segmented facial-feature regions, including the position information of the left and right eyes, the position information of the nose, and the position information of the mouth; then, according to the position information of the segmented facial-feature regions, the image information of the corresponding regions can be determined from the target image and the generated image.
- the image information in the generated image corresponding to the position information of each object can be used as the first partial image information, and the image information in the target image corresponding to the position information of each object can be used as the second partial image information; finally, the L1 norm between the first partial image information and the second partial image information is calculated, that is, the sum of the absolute values of the differences between the image information corresponding to the same object, and a third loss function is constructed based on the L1 norms corresponding to all low-quality image pairs. The expression of the third loss function is shown in formula (8):
- M is the position information of each object area after segmentation, i is any low-quality image pair, and m is the total number of low-quality image pairs.
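- formula (8) is not reproduced in this text. The sketch below is one reading of the description: M is represented as a binary mask of the segmented object regions, and the L1 norm of the masked difference is averaged over the m pairs. The mask representation, the averaging and the name `segmentation_l1_loss` are assumptions.

```python
import numpy as np

def segmentation_l1_loss(generated, target, mask):
    """Assumed form: (1/m) * sum_i || M * G(z_i) - M * x_i ||_1.

    generated, target and mask all have shape (m, H, W); mask is 1 inside
    the segmented object regions (e.g. eyes, nose, mouth) and 0 elsewhere.
    """
    generated = np.asarray(generated, dtype=float)
    target = np.asarray(target, dtype=float)
    m = generated.shape[0]
    diff = np.abs(generated * mask - target * mask)
    return diff.reshape(m, -1).sum(axis=1).mean()
```

Pixels outside the segmented regions do not contribute, so this term focuses the generation network on the facial features.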
- a plurality of low-quality image pairs can be regarded as a batch of training samples, and multiple rounds of iterative training can be performed on the generative adversarial deep neural network model to be trained according to the training samples, until a preset number of training rounds is completed or each loss function tends to converge.
- the parameters of the model can be optimized through the first loss function, the second loss function and the third loss function in sequence.
- FIG. 10 shows a general flow chart of training the generative adversarial deep neural network model to be trained on face images; in this case the target image is a target face image and the low-quality image is a low-quality face image. As shown in FIG. 10, in S1001, the target face image is aligned to obtain a target face alignment image; in S1002, noise addition and/or blurring is applied to the target face alignment image to obtain a low-quality face image; in S1003, the low-quality face image is input into the generation network and processed by the generation network to output a generated face image, which is the low-quality face image after optimization; in S1004, the generated face image and the target face image are paired and input to the discriminant network, the classification network and the segmentation network in the post-processing network respectively, and each network performs feature extraction on the generated face image and the target face image to determine a joint loss function, which includes the first loss function, the second loss function and the third loss function corresponding to the respective networks. Further, the parameters of the generative adversarial deep neural network model to be trained are optimized according to the first loss function, the second loss function and the third loss function until the generated face image is close to the target face image.
- after training, the generation network in the generative adversarial deep neural network model can be used to optimize other aligned face images to be optimized, so as to obtain clear, noise-free face images with facial details, which further improves the user experience.
- after the optimized image output by the generation network is obtained, whether to perform position reset processing on the optimized image can be determined according to the inclination angle of the face in the image to be optimized relative to the standard frontal face position. For example, when the inclination angle of the face relative to the standard frontal face position is small (the difference is not visually obvious), no processing is required for the optimized image. When the inclination angle of the face relative to the standard frontal face position is large, position reset processing is performed on the optimized image so that each object is restored to its original position and angle, and finally an optimized image corresponding to the original image to be optimized is obtained.
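- this passage does not state how the position reset is computed. One natural reading, offered here only as an assumption, is to apply the inverse of the transformation matrix used during alignment; the 2×3 affine representation and the function names are hypothetical.

```python
import numpy as np

def invert_affine(matrix):
    """Invert a 2x3 affine transform by extending it to a 3x3 homogeneous matrix."""
    full = np.vstack([matrix, [0.0, 0.0, 1.0]])
    return np.linalg.inv(full)[:2, :]

def transform_points(matrix, points):
    """Apply a 2x3 affine transform to an (n, 2) array of points."""
    points = np.asarray(points, dtype=float)
    return points @ matrix[:, :2].T + matrix[:, 2]
```

Applying the alignment matrix and then its inverse returns every point to its original position, which is the behavior the reset step calls for.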
- FIG. 11A, FIG. 11B, FIG. 11C, FIG. 11D, FIG. 11E, and FIG. 11F show three groups of interface schematic diagrams for optimizing face images by using the trained generation network. FIG. 11A, FIG. 11C and FIG. 11E show the face images to be optimized; it can be seen that they have heavy noise, low definition, and blurred edge contours. After the face images to be optimized are aligned and the aligned images are processed by the generation network, face images with high definition, clear edge contours and rich facial details are obtained, as shown in FIG. 11B, FIG. 11D, and FIG. 11F, respectively.
- the optimized image has high definition, contains image details, and has its noise accurately removed while retaining the characteristics of the original image.
- the target images used in model training are all high-definition images. Therefore, when the image to be optimized is optimized by the trained generation network, the edge contour information can be strengthened and the missing parts of the image can be supplemented to a certain extent. That is to say, the image optimization method in this application also has certain image completion and restoration capabilities.
- the image optimization method of the present application is based on the generative adversarial deep neural network model, so it takes less time to optimize the image, and has the characteristics of high scalability and good portability.
- the above method may be implemented as a computer program executed by a processor (including a CPU and a GPU). For example, the training of the above-mentioned generative adversarial deep neural network model is implemented by a GPU, or the optimization processing of the image to be optimized is implemented by a CPU or a GPU based on the trained generative adversarial deep neural network model.
- when the computer program is executed by the processor, the above-mentioned functions defined by the above-mentioned methods provided in this application are performed.
- the program can be stored in a computer-readable storage medium, which can be a read-only memory, a magnetic disk, an optical disk, or the like.
- FIG. 12 shows a schematic structural diagram of an image optimization apparatus in an exemplary embodiment of the present application.
- the above image optimization apparatus 1200 includes: an acquisition module 1201 , an alignment module 1202 and an optimization module 1203 .
- the acquisition module 1201 is used to acquire the image to be optimized;
- the alignment module 1202 is used to perform alignment processing on the image to be optimized to obtain the aligned image to be optimized, wherein the points of each object in the target area of the aligned image to be optimized are distributed at standard positions;
- the optimization module 1203 is used to input the aligned image to be optimized into the generation network, and perform feature extraction on the aligned image to be optimized through the generation network to obtain the optimized image; wherein the generation network is obtained by training the generative adversarial deep neural network model to be trained according to low-quality image pairs and the joint loss function, and each low-quality image pair includes a target image and a low-quality image corresponding to the target image.
- the alignment module 1202 is configured to: perform alignment processing on the image to be optimized according to a standard position template to obtain the aligned image to be optimized.
- the standard position template is the point distribution of each object in a specific area.
- the alignment module 1202 is configured to: detect a target area in the image to be optimized, where the target area is of the same type as the specific area; determine a transformation matrix between the image data of the target area and the standard position template; and perform a transformation operation on the image corresponding to the target area according to the transformation matrix to obtain the aligned image to be optimized.
- the image optimization apparatus 1200 further includes: a low-quality image pair acquisition module, configured to acquire a plurality of the low-quality image pairs; a generated image acquisition module, configured to use each low-quality image pair as a target image pair and input the low-quality image in the target image pair to the generation network in the generative adversarial deep neural network model to be trained to obtain a generated image; a loss function construction module, configured to input the generated image and the target image in the target image pair to the post-processing network in the generative adversarial deep neural network model to be trained, and to process the generated image and the target image in the target image pair through the post-processing network to construct the joint loss function; and a model parameter adjustment module, configured to optimize the parameters of the generative adversarial deep neural network model to be trained according to the joint loss function to obtain the generation network.
- the low-quality image pair acquisition module is configured to: acquire multiple target images, and perform alignment processing on the multiple target images to obtain multiple aligned images; perform low-quality processing on each of the aligned images to obtain a low-quality image corresponding to each of the target images; and form the low-quality image pair according to the target image and the low-quality image corresponding to the target image.
- the low-quality processing includes noise addition and/or blurring.
- the noise addition includes adding one or more of Gaussian noise, Poisson noise, and salt and pepper noise, and the blurring includes one or more of mean filtering, Gaussian filtering, median filtering, bilateral filtering, and resolution reduction.
- the post-processing network includes a discriminant network, a classification network and a segmentation network;
- the loss function construction module includes: a first loss function construction unit, configured to input the generated image and the target image in the target image pair to the discriminant network, obtain the first discrimination result and the second discrimination result, and construct the first loss function according to the first discrimination result and the second discrimination result;
- a second loss function construction unit, configured to input the generated image and the target image in the target image pair to the classification network, obtain first image information and second image information, and construct a second loss function according to the first image information and the second image information;
- a third loss function construction unit, configured to input the generated image and the target image in the target image pair to the segmentation network, obtain first partial image information and second partial image information, and construct a third loss function according to the first partial image information and the second partial image information;
- a joint loss function construction unit, configured to construct the joint loss function according to the first loss function, the second loss function and the third loss function.
- the second loss function construction unit is configured to: subtract the first image information and the second image information corresponding to each of the low-quality image pairs to obtain an image information difference; and construct the second loss function according to the image information differences corresponding to all the low-quality image pairs.
- both the target image and the generated image in the target image pair include multiple objects; the third loss function construction unit is configured to: segment the target image through the segmentation network to obtain the position information of each object in the target image; use the image information corresponding to the position information of each object in the generated image as the first partial image information, and use the image information corresponding to the position information of each object in the target image as the second partial image information.
- the third loss function construction unit is configured to: calculate the L1 norm between the first partial image information and the second partial image information; and construct the third loss function according to the L1 norms corresponding to all the low-quality image pairs.
- the model parameter adjustment module is configured to: in each round of training, optimize the parameters of the generative adversarial deep neural network model to be trained through the first loss function, the second loss function, and the third loss function in sequence, to obtain the generation network.
- optimizing the parameters of the generative adversarial deep neural network model to be trained through the first loss function includes: fixing the parameters of the generation network, and optimizing the parameters of the discriminant network according to the first discrimination result and the second discrimination result; then fixing the optimized parameters of the discriminant network, and optimizing the parameters of the generation network according to the first discrimination result.
- the generation network includes: a downsampling layer, a residual network layer, and an upsampling layer.
- an image optimization apparatus may be configured in a terminal device or a server, and when a user requests through the terminal device that a selected low-quality image be optimized, the image optimization method in the above embodiment may be executed to obtain an optimized image.
- FIG. 13A , FIG. 13B , FIG. 13C , and FIG. 13D show schematic diagrams of interfaces for optimizing low-quality images.
- the user can turn on the camera function in the terminal device, and the prompt "Please aim the camera at the face" in the interface instructs the user to photograph the target face; the target face is then photographed to obtain a low-quality face image, which is displayed in the photo browsing interface, as shown in FIG. 13B. The photo browsing interface shown in FIG. 13B includes a "reshoot" button and an "optimization processing" button; the user can take a new image through the "reshoot" button.
- when the user selects the "optimization processing" button in the photo browsing interface, optimization processing is performed on the low-quality face image.
- after the "optimization processing" button is selected, its color can change, for example to gray as shown in FIG. 13C; the image optimization service is then called to optimize the low-quality face image captured by the user, and the resulting optimized image is returned to the optimized photo browsing interface, as shown in FIG. 13D.
- the user may also select an image that has already been photographed or downloaded from the photo album for optimization processing; the specific processing flow is the same as the image optimization flow in the above-mentioned embodiment, and will not be repeated here.
- FIG. 14 shows a schematic structural diagram of a computer system suitable for implementing the electronic device according to the embodiment of the present application.
- the computer system 1400 includes a processor 1401, wherein the processor 1401 may include a graphics processing unit (Graphics Processing Unit, GPU) and a central processing unit (Central Processing Unit, CPU), and can execute various appropriate actions and processes according to a program stored in a read-only memory (Read-Only Memory, ROM) 1402 or a program loaded from a storage section 1408 into a random access memory (Random Access Memory, RAM) 1403. The RAM 1403 also stores various programs and data required for system operation.
- a processor (GPU/CPU) 1401, a ROM 1402, and a RAM 1403 are connected to each other through a bus 1404.
- An Input/Output (I/O) interface 1405 is also connected to the bus 1404 .
- Computer system 1400 also includes input portion 1406 , output portion 1407 , communication portion 1409 , drives 1410 , and removable media 1411 .
- Embodiments of the present application include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method illustrated in the flowchart.
- The computer program may be downloaded and installed from a network via the communication portion 1409, and/or installed from the removable medium 1411.
- When the computer program is executed by the processor (GPU/CPU) 1401, the various functions defined in the system of the present application are performed.
- The computer system 1400 may further include an artificial intelligence (AI) processor for handling computing operations related to machine learning.
- The computer-readable medium shown in the embodiments of the present application may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two.
- The computer-readable storage medium can be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above.
- Computer-readable storage media may include, but are not limited to, an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), flash memory, optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
- A computer-readable storage medium can be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
- A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code.
- Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
- A computer-readable signal medium can also be any computer-readable medium, other than a computer-readable storage medium, that can transmit, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
- Program code embodied on a computer-readable medium may be transmitted using any suitable medium, including but not limited to wireless, wired, etc., or any suitable combination of the foregoing.
- The units involved in the embodiments of the present application may be implemented in software or hardware, and the described units may also be provided in a processor. In some cases, the names of these units do not constitute a limitation on the units themselves.
- The present application also provides a computer-readable medium.
- The computer-readable medium may be included in the electronic device described in the above embodiments, or it may exist separately without being assembled into the electronic device.
- The computer-readable medium carries one or more programs which, when executed by an electronic device, enable the electronic device to implement the methods described in the above embodiments.
- The exemplary embodiments described herein may be implemented by software, or by software combined with the necessary hardware. Therefore, the technical solutions according to the embodiments of the present application may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) or on a network, and which includes several instructions that cause a computing device (which may be a personal computer, a server, a touch terminal, a network device, etc.) to execute the method according to the embodiments of the present application.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Biophysics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Human Computer Interaction (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
Description
Claims (17)
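- An image optimization method, performed by a device having a computing and processing function, the method comprising: obtaining an image to be optimized; aligning the image to be optimized to obtain an aligned image to be optimized, wherein the points of each object in a target region of the aligned image to be optimized are distributed at standard positions; and inputting the aligned image to be optimized into a generative network, and performing feature extraction on the aligned image to be optimized through the generative network to obtain an optimized image; wherein the generative network is obtained by training a to-be-trained generative adversarial deep neural network model according to low-quality image pairs and a joint loss function, each low-quality image pair comprising a target image and a low-quality image corresponding to the target image.
- The image optimization method according to claim 1, wherein aligning the image to be optimized to obtain the aligned image to be optimized comprises: aligning the image to be optimized according to a standard position template to obtain the aligned image to be optimized.
- The image optimization method according to claim 2, wherein the standard position template is a point distribution of each object in a specific region, and aligning the image to be optimized according to the standard position template to obtain the aligned image to be optimized comprises: detecting a target region in the image to be optimized, the target region being of the same type as the specific region; determining a transformation matrix between image data of the target region and the standard position template; and performing a transformation operation on the image corresponding to the target region according to the transformation matrix to obtain the aligned image to be optimized.
- The image optimization method according to claim 1, wherein before the aligned image to be optimized is input into the generative network, the method further comprises: obtaining a plurality of the low-quality image pairs; taking each low-quality image pair as a target image pair, and inputting the low-quality image in the target image pair into the generative network of the to-be-trained generative adversarial deep neural network model to obtain a generated image; inputting the generated image and the target image in the target image pair into a post-processing network of the to-be-trained generative adversarial deep neural network model, and processing the generated image and the target image in the target image pair through the post-processing network to construct the joint loss function; and optimizing parameters of the to-be-trained generative adversarial deep neural network model according to the joint loss function to obtain the generative network.
- The image optimization method according to claim 4, wherein obtaining the plurality of low-quality image pairs comprises: obtaining a plurality of target images; aligning each of the target images to obtain a plurality of aligned images; performing low-quality processing on each of the aligned images to obtain a low-quality image corresponding to each target image; and forming the low-quality image pairs from the target images and their corresponding low-quality images.
- The image optimization method according to claim 5, wherein the low-quality processing comprises noise addition and/or blurring.
- The image optimization method according to claim 6, wherein the noise addition comprises adding one or more of Gaussian noise, Poisson noise, and salt-and-pepper noise, and the blurring comprises one or more of mean filtering, Gaussian filtering, median filtering, bilateral filtering, and resolution reduction.
- The image optimization method according to claim 4, wherein the post-processing network comprises a discriminative network, a classification network, and a segmentation network, and processing the generated image and the target image in the target image pair through the post-processing network to construct the joint loss function comprises: inputting the generated image and the target image in the target image pair into the discriminative network to obtain a first discrimination result and a second discrimination result, and constructing a first loss function according to the first discrimination result and the second discrimination result; inputting the generated image and the target image in the target image pair into the classification network to obtain first image information and second image information, and constructing a second loss function according to the first image information and the second image information; inputting the generated image and the target image in the target image pair into the segmentation network to obtain first local image information and second local image information, and constructing a third loss function according to the first local image information and the second local image information; and constructing the joint loss function according to the first loss function, the second loss function, and the third loss function.
- The image optimization method according to claim 8, wherein constructing the second loss function according to the first image information and the second image information comprises: taking the difference between the first image information and the second image information corresponding to each low-quality image pair to obtain an image information difference; and constructing the second loss function according to the image information differences corresponding to all the low-quality image pairs.
- The image optimization method according to claim 8, wherein the target image in the target image pair and the generated image each comprise a plurality of objects, and inputting the generated image and the target image in the target image pair into the segmentation network to obtain the first local image information and the second local image information comprises: segmenting the target image through the segmentation network to obtain position information of each object in the target image; and taking the image information in the generated image that corresponds to the position information of each object as the first local image information, and taking the image information in the target image that corresponds to the position information of each object as the second local image information.
- The image optimization method according to claim 10, wherein constructing the third loss function according to the first local image information and the second local image information comprises: calculating an L1 norm between the first local image information and the second local image information; and constructing the third loss function according to the L1 norms corresponding to all the low-quality image pairs.
- The image optimization method according to claim 8, wherein optimizing the parameters of the to-be-trained generative adversarial deep neural network model according to the joint loss function to obtain the generative network comprises: in each round of training, optimizing the parameters of the to-be-trained generative adversarial deep neural network model through the first loss function, the second loss function, and the third loss function in sequence to obtain the generative network.
- The image optimization method according to claim 12, wherein optimizing the parameters of the to-be-trained generative adversarial deep neural network model through the first loss function comprises: keeping the parameters of the generative network fixed, and optimizing the parameters of the discriminative network according to the first discrimination result and the second discrimination result; and keeping the optimized parameters of the discriminative network fixed, and optimizing the parameters of the generative network according to the first discrimination result.
- An image optimization apparatus, deployed on a device having a computing and processing function, comprising: an obtaining module, configured to obtain an image to be optimized; an alignment module, configured to align the image to be optimized to obtain an aligned image to be optimized, wherein the points of each object in a target region of the aligned image to be optimized are distributed at standard positions; and an optimization module, configured to input the aligned image to be optimized into a generative network and perform feature extraction on the aligned image to be optimized through the generative network to obtain an optimized image; wherein the generative network is obtained by training a to-be-trained generative adversarial deep neural network model according to low-quality image pairs and a joint loss function, each low-quality image pair comprising a target image and a low-quality image corresponding to the target image.
- A computer-readable storage medium storing a computer program, wherein the program, when executed by a processor, implements the image optimization method according to any one of claims 1 to 13.
- An electronic device for image optimization, comprising: one or more processors; and a storage apparatus configured to store one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the image optimization method according to any one of claims 1 to 13.
- A computer program product which, when executed, is used to perform the image optimization method according to any one of claims 1 to 13.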
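
As an illustration only, the low-quality processing named in the claims above (blurring by mean filtering and adding Gaussian noise) and the L1 norm used for the third loss function can be sketched in plain Python. The function names, the blur-then-noise order, the grayscale list-of-lists image representation, and all parameter values are assumptions made for this sketch; the claims do not specify them, and a real system would more likely operate on tensors inside a deep learning framework.

```python
import random


def mean_filter(img, radius=1):
    """Blur a 2D grayscale image with a box (mean) filter.

    Windows that extend past the border are clipped to the image.
    """
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            window = [img[j][i] for j in ys for i in xs]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise and clamp each pixel to [0, 255]."""
    return [
        [min(255.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
        for row in img
    ]


def make_low_quality_pair(target, sigma=10.0, radius=1, seed=0):
    """Return (target, low_quality) for one already-aligned target image.

    Assumption: blur first, then add noise; the order is not fixed by the claims.
    """
    rng = random.Random(seed)  # seeded so the degradation is reproducible
    low = add_gaussian_noise(mean_filter(target, radius), sigma, rng)
    return target, low


def l1_distance(a, b):
    """L1 norm of the pixel-wise difference between two same-sized images."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


if __name__ == "__main__":
    # Hypothetical 4x4 gradient standing in for an aligned target image.
    target = [[float(10 * (x + y)) for x in range(4)] for y in range(4)]
    _, low = make_low_quality_pair(target, sigma=5.0)
    print("L1 distance between target and degraded image:", l1_distance(target, low))
```

In training as described by the claims, the degraded image would be fed to the generative network, and an L1 term of this kind would be computed only over the regions returned by the segmentation network rather than over the whole image.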
Priority Applications (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2022552468A JP7446457B2 (ja) | 2020-06-28 | 2021-05-26 | Image optimization method and apparatus, computer storage medium, computer program, and electronic device |
| EP21832144.6A EP4050511B1 (en) | 2020-06-28 | 2021-05-26 | Image optimisation method and apparatus, computer storage medium, and electronic device |
| US17/735,948 US12175640B2 (en) | 2020-06-28 | 2022-05-03 | Image optimization method and apparatus, computer storage medium, and electronic device |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010595618.2 | 2020-06-28 | ||
| CN202010595618.2A CN111488865B (zh) | 2020-06-28 | 2020-06-28 | Image optimization method and apparatus, computer storage medium, and electronic device |
Related Child Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US17/735,948 Continuation US12175640B2 (en) | 2020-06-28 | 2022-05-03 | Image optimization method and apparatus, computer storage medium, and electronic device |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2022001509A1 (zh) | 2022-01-06 |
Family
ID=71810596
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2021/096024 Ceased WO2022001509A1 (zh) | 2020-06-28 | 2021-05-26 | Image optimization method and apparatus, computer storage medium, and electronic device |
Country Status (5)
| Country | Link |
|---|---|
| US (1) | US12175640B2 (zh) |
| EP (1) | EP4050511B1 (zh) |
| JP (1) | JP7446457B2 (zh) |
| CN (1) | CN111488865B (zh) |
| WO (1) | WO2022001509A1 (zh) |
Families Citing this family (25)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
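| KR102537207B1 (ko) * | 2019-12-30 | 2023-05-25 | 포항공과대학교 산학협력단 | Image processing method and apparatus based on machine learning |
| CN111488865B (zh) | 2020-06-28 | 2020-10-27 | 腾讯科技(深圳)有限公司 | Image optimization method and apparatus, computer storage medium, and electronic device |
| CN112233207B (zh) * | 2020-10-16 | 2025-02-11 | 北京字跳网络技术有限公司 | Image processing method, apparatus, device, and computer-readable medium |
| CN112488944A (zh) * | 2020-12-02 | 2021-03-12 | 北京字跳网络技术有限公司 | Sample generation and model training method, apparatus, device, and computer-readable medium |
| CN112651948B (zh) * | 2020-12-30 | 2022-04-12 | 重庆科技学院 | Machine-vision-based intelligent tracking and identification method for artemisinin extraction |
| CN115131218A (zh) * | 2021-03-25 | 2022-09-30 | 腾讯科技(深圳)有限公司 | Image processing method and apparatus, computer-readable medium, and electronic device |
| CN113177982B (zh) * | 2021-04-16 | 2023-03-10 | 杭州睿影科技有限公司 | Method, apparatus, device, and system for processing security inspection image data |
| CN113344832A (zh) * | 2021-05-28 | 2021-09-03 | 杭州睿胜软件有限公司 | Image processing method and apparatus, electronic device, and storage medium |
| CN113298807A (zh) * | 2021-06-22 | 2021-08-24 | 北京航空航天大学 | Computed tomography image processing method and apparatus |
| CN114299555A (zh) * | 2022-01-27 | 2022-04-08 | 敦泰电子(深圳)有限公司 | Fingerprint recognition method, fingerprint module, and electronic device |
| US12020364B1 (en) * | 2022-04-07 | 2024-06-25 | Bentley Systems, Incorporated | Systems, methods, and media for modifying the coloring of images utilizing machine learning |
| CN114937187B (zh) * | 2022-06-16 | 2026-03-20 | 京东科技信息技术有限公司 | Image optimization method, apparatus, device, and storage medium |
| CN115147314B (zh) * | 2022-09-02 | 2022-11-29 | 腾讯科技(深圳)有限公司 | Image processing method, apparatus, device, and storage medium |
| CN117036180A (zh) * | 2022-10-13 | 2023-11-10 | 腾讯科技(深圳)有限公司 | Image optimization method and apparatus, electronic device, medium, and program product |
| CN115619672B (zh) * | 2022-10-20 | 2026-02-06 | 深圳前海微众银行股份有限公司 | Image processing method, apparatus, device, and storage medium |
| CN115689923A (zh) * | 2022-10-27 | 2023-02-03 | 佛山读图科技有限公司 | Low-dose CT image denoising system and denoising method |
| CN116363092A (zh) * | 2023-03-27 | 2023-06-30 | 中国船舶集团有限公司第七〇九研究所 | Neural-network-based crowd counting method and apparatus |
| CN116363013A (zh) * | 2023-04-06 | 2023-06-30 | 深圳市威富视界有限公司 | Image processing method and apparatus |
| CN116385308B (zh) * | 2023-04-15 | 2024-05-07 | 广州海至亚传媒科技有限公司 | Joint image processing optimization strategy selection system |
| CN116704286B (zh) * | 2023-06-02 | 2024-11-22 | 中国科学技术大学 | High-density sampling vibration image processing method and storage medium |
| CN116977214B (zh) * | 2023-07-21 | 2024-08-06 | 萱闱(北京)生物科技有限公司 | Image optimization method, apparatus, medium, and computing device |
| CN118095101A (zh) * | 2024-04-17 | 2024-05-28 | 中国石油大学(华东) | Machine-learning-based fast forward modeling method for terahertz metasurface scattering parameters |
| CN118097122B (zh) * | 2024-04-24 | 2024-11-22 | 中南大学 | Method and apparatus for recognizing underground target images, electronic device, and storage medium |
| CN118394869B (zh) * | 2024-05-13 | 2025-01-24 | 临沂维测工程测绘有限公司 | Geographic information collection method and system for territorial spatial planning |
| CN118587206B (zh) * | 2024-07-30 | 2024-11-22 | 维克多精密工业(深圳)有限公司 | Artificial-intelligence-based mold structural health monitoring method and system |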
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN108537743A (zh) * | 2018-03-13 | 2018-09-14 | 杭州电子科技大学 | Facial image enhancement method based on a generative adversarial network |
| CN109376582A (zh) * | 2018-09-04 | 2019-02-22 | 电子科技大学 | Interactive face cartoon method based on a generative adversarial network |
| US20190114748A1 (en) * | 2017-10-16 | 2019-04-18 | Adobe Systems Incorporated | Digital Image Completion Using Deep Learning |
| CN111488865A (zh) * | 2020-06-28 | 2020-08-04 | 腾讯科技(深圳)有限公司 | Image optimization method and apparatus, computer storage medium, and electronic device |
Family Cites Families (18)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
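| JP5359465B2 (ja) | 2009-03-31 | 2013-12-04 | ソニー株式会社 | Solid-state imaging device, signal processing method for solid-state imaging device, and imaging device |
| CN104318603A (zh) * | 2014-09-12 | 2015-01-28 | 上海明穆电子科技有限公司 | Method and system for generating a 3D model from photos retrieved from a mobile phone album |
| CN107103590B (zh) * | 2017-03-22 | 2019-10-18 | 华南理工大学 | Image reflection removal method based on a deep convolutional generative adversarial network |
| US10565758B2 (en) * | 2017-06-14 | 2020-02-18 | Adobe Inc. | Neural face editing with intrinsic image disentangling |
| CN107481188A (zh) * | 2017-06-23 | 2017-12-15 | 珠海经济特区远宏科技有限公司 | Image super-resolution reconstruction method |
| US11011275B2 (en) | 2018-02-12 | 2021-05-18 | Ai.Skopy, Inc. | System and method for diagnosing gastrointestinal neoplasm |
| US10825219B2 (en) * | 2018-03-22 | 2020-11-03 | Northeastern University | Segmentation guided image generation with adversarial networks |
| CN108520503B (zh) * | 2018-04-13 | 2020-12-22 | 湘潭大学 | Method for repairing defective face images based on an autoencoder and a generative adversarial network |
| US10284432B1 (en) | 2018-07-03 | 2019-05-07 | Kabushiki Kaisha Ubitus | Method for enhancing quality of media transmitted via network |
| CN109685724B (zh) * | 2018-11-13 | 2020-04-03 | 天津大学 | Deep-learning-based symmetry-aware face image completion method |
| CN109615582B (zh) * | 2018-11-30 | 2023-09-01 | 北京工业大学 | Face image super-resolution reconstruction method based on an attribute-description generative adversarial network |
| CN109685072B (zh) * | 2018-12-22 | 2021-05-14 | 北京工业大学 | High-quality reconstruction method for compound degraded images based on a generative adversarial network |
| CN110349102B (zh) * | 2019-06-27 | 2025-10-10 | 腾讯科技(深圳)有限公司 | Image beautification processing method, image beautification processing apparatus, and electronic device |
| CN110363116B (zh) * | 2019-06-28 | 2021-07-23 | 上海交通大学 | Irregular face correction method, system, and medium based on GLD-GAN |
| CN110472566B (zh) * | 2019-08-14 | 2022-04-26 | 旭辉卓越健康信息科技有限公司 | High-precision blurred face recognition method |
| CN111080527B (zh) * | 2019-12-20 | 2023-12-05 | 北京金山云网络技术有限公司 | Image super-resolution method and apparatus, electronic device, and storage medium |
| CN111126307B (zh) * | 2019-12-26 | 2023-12-12 | 东南大学 | Small-sample face recognition method combining sparse representation and neural networks |
| CN111179177B (zh) * | 2019-12-31 | 2024-03-26 | 深圳市联合视觉创新科技有限公司 | Image reconstruction model training method, image reconstruction method, device, and medium |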
- 2020-06-28: CN application CN202010595618.2A, patent CN111488865B (zh), status Active
- 2021-05-26: JP application JP2022552468A, patent JP7446457B2 (ja), status Active
- 2021-05-26: WO application PCT/CN2021/096024, publication WO2022001509A1 (zh), status Ceased
- 2021-05-26: EP application EP21832144.6A, patent EP4050511B1 (en), status Active
- 2022-05-03: US application US17/735,948, patent US12175640B2 (en), status Active
Patent Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20190114748A1 (en) * | 2017-10-16 | 2019-04-18 | Adobe Systems Incorporated | Digital Image Completion Using Deep Learning |
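| CN108537743A (zh) * | 2018-03-13 | 2018-09-14 | 杭州电子科技大学 | Facial image enhancement method based on a generative adversarial network |
| CN109376582A (zh) * | 2018-09-04 | 2019-02-22 | 电子科技大学 | Interactive face cartoon method based on a generative adversarial network |
| CN111488865A (zh) * | 2020-06-28 | 2020-08-04 | 腾讯科技(深圳)有限公司 | Image optimization method and apparatus, computer storage medium, and electronic device |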
Also Published As
| Publication number | Publication date |
|---|---|
| EP4050511A1 (en) | 2022-08-31 |
| CN111488865A (zh) | 2020-08-04 |
| EP4050511B1 (en) | 2025-10-01 |
| JP7446457B2 (ja) | 2024-03-08 |
| US20220261968A1 (en) | 2022-08-18 |
| CN111488865B (zh) | 2020-10-27 |
| US12175640B2 (en) | 2024-12-24 |
| JP2023515654A (ja) | 2023-04-13 |
| EP4050511A4 (en) | 2022-12-28 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
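| WO2022001509A1 (zh) | Image optimization method and apparatus, computer storage medium, and electronic device | |
| EP3948764B1 (en) | Method and apparatus for training neural network model for enhancing image detail | |
| CN110163080B (zh) | Face key point detection method and apparatus, storage medium, and electronic device | |
| EP3916627A1 (en) | Living body detection method based on facial recognition, and electronic device and storage medium | |
| CN111767906B (zh) | Face detection model training method, face detection method, apparatus, and electronic device | |
| CN106682632B (zh) | Method and apparatus for processing face images | |
| CN110516201A (zh) | Image processing method and apparatus, electronic device, and storage medium | |
| CN109117755B (zh) | Face liveness detection method, system, and device | |
| US20240404018A1 (en) | Image processing method and apparatus, device, storage medium and program product | |
| CN112581370A (zh) | Training and reconstruction method for a super-resolution reconstruction model of face images | |
| CN108388889B (zh) | Method and apparatus for analyzing face images | |
| JP2019219928A (ja) | Image processing apparatus, image processing method, and image processing program | |
| CN116977548A (zh) | Three-dimensional reconstruction method, apparatus, device, and computer-readable storage medium | |
| CN113436081B (zh) | Data processing method, image enhancement method, and model training method therefor | |
| WO2020087434A1 (zh) | Face image sharpness evaluation method and apparatus | |
| CN115222606A (zh) | Image processing method and apparatus, computer-readable medium, and electronic device | |
| CN117036179A (zh) | Image processing method and apparatus, storage medium, and computer device | |
| CN115375565B (zh) | Method and apparatus for removing specific-shape noise from images, and computer device | |
| CN114387315B (zh) | Image processing model training and image processing method, apparatus, device, and medium | |
| CN117635838A (zh) | Three-dimensional face reconstruction method, device, storage medium, and apparatus | |
| CN119646786B (zh) | User authentication method and system for an online education platform | |
| Yang et al. | An end‐to‐end perceptual enhancement method for UHD portrait images | |
| HK40027974A (zh) | Image optimization method and apparatus, computer storage medium, and electronic device | |
| HK40027974B (zh) | Image optimization method and apparatus, computer storage medium, and electronic device | |
| CN117152816B (zh) | Convolutional-neural-network-based dense small face detection method and system |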
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21832144; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2021832144; Country of ref document: EP; Effective date: 20220523 |
| | ENP | Entry into the national phase | Ref document number: 2022552468; Country of ref document: JP; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | WWG | Wipo information: grant in national office | Ref document number: 2021832144; Country of ref document: EP |
| | WWG | Wipo information: grant in national office | Ref document number: 202237036773; Country of ref document: IN |