CN108171663A - Image filling system based on a convolutional neural network with feature map nearest neighbor replacement - Google Patents

Image filling system based on a convolutional neural network with feature map nearest neighbor replacement

Info

Publication number
CN108171663A
CN108171663A (application number CN201711416650.4A)
Authority
CN
China
Prior art keywords
image
convolutional layer
input object
filled
deconvolution layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711416650.4A
Other languages
Chinese (zh)
Other versions
CN108171663B (en)
Inventor
左旺孟 (Wangmeng Zuo)
颜肇义 (Zhaoyi Yan)
李晓明 (Xiaoming Li)
山世光 (Shiguang Shan)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology Shenzhen
Original Assignee
Harbin Institute of Technology Shenzhen
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology Shenzhen filed Critical Harbin Institute of Technology Shenzhen
Priority to CN201711416650.4A priority Critical patent/CN108171663B/en
Publication of CN108171663A publication Critical patent/CN108171663A/en
Application granted granted Critical
Publication of CN108171663B publication Critical patent/CN108171663B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/77 Retouching; Inpainting; Scratch removal
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

An image filling system based on a convolutional neural network with feature map nearest neighbor replacement belongs to the technical field of image filling and solves the problem that existing image filling methods cannot quickly obtain a filled image with consistent overall semantics and good sharpness. In the system, a generation network first encodes and then decodes the image to be filled to obtain the filled image. The decoder of the generation network includes N deconvolutional layers. For any M deconvolutional layers among the first to the (N−1)-th deconvolutional layers, the generation network obtains an additional feature map by feature map nearest neighbor replacement, based on the output of each of these deconvolutional layers and the output of the convolutional layer corresponding to that deconvolutional layer, and uses the output of each deconvolutional layer, the output of its corresponding convolutional layer, and the additional feature map together as the input object of the next deconvolutional layer. A discrimination network is used to judge whether the filled image is the real image corresponding to the image to be filled.

Description

Image Filling System Based on a Convolutional Neural Network with Feature Map Nearest Neighbor Replacement

Technical Field

The invention relates to an image filling system and belongs to the technical field of image filling.

Background Art

Image filling is a fundamental problem in computer vision and image processing. It is mainly used to repair and reconstruct damaged images or to remove unwanted objects from an image.

Existing image filling methods mainly include diffusion-based methods, exemplar-based methods, and deep-learning-based methods.

The basic idea of diffusion-based image filling is to propagate, pixel by pixel, the image information at the boundary of the region to be filled into the interior of that region. When the region to be filled is small, structurally simple, and uniform in texture, this method completes the filling task well. However, when the region to be filled is large, the filled image obtained with this method lacks sharpness.

The basic idea of exemplar-based image filling is to fill, patch by patch, from the known region of the image toward the region to be filled. Each time a patch is filled, the patch in the known region that is most similar to the patch at the boundary of the region to be filled is used. Compared with diffusion-based methods, exemplar-based methods produce filled images with better texture and higher sharpness. However, because exemplar-based methods progressively replace unknown patches in the region to be filled with similar patches from the known region, they cannot produce a filled image whose overall semantics are consistent.

Deep-learning-based image filling refers mainly to applying deep neural networks to image filling. Researchers have proposed encoder-decoder networks for filling images whose central region is missing. However, this approach only applies to 128*128 RGB images, and although the resulting filled images satisfy overall semantic consistency, their sharpness is poor. To address this problem, researchers have tried multi-scale iterative updating to produce sharp fillings of large images. Although the filled images obtained in this way have overall semantic consistency and good sharpness, the method is extremely slow: on a Titan X GPU, filling a single 256*256 RGB image takes tens of seconds to several minutes.

Summary of the Invention

To solve the problem that existing image filling methods cannot quickly obtain a filled image with consistent overall semantics and good sharpness, the present invention proposes an image filling system based on a convolutional neural network with feature map nearest neighbor replacement.

The image filling system of the present invention comprises a generation network and a discrimination network.

The generation network comprises an encoder and a decoder; the encoder comprises N convolutional layers and the decoder comprises N deconvolutional layers, with N ≥ 2.

The generation network obtains the filled image by first encoding and then decoding the image to be filled.

For any M deconvolutional layers among the first to the (N−1)-th deconvolutional layers, the generation network obtains an additional feature map by feature map nearest neighbor replacement, based on the output of each of these deconvolutional layers and the output of the convolutional layer corresponding to that deconvolutional layer, and uses the output of each deconvolutional layer, the output of its corresponding convolutional layer, and the obtained additional feature map together as the input object of the next deconvolutional layer, where 1 ≤ M ≤ N−1.

The discrimination network is used to judge whether the filled image is the real image corresponding to the image to be filled, thereby constraining the weight learning of the generation network.

Preferably, the encoder comprises convolutional layers E1 to E8 and the decoder comprises deconvolutional layers D1 to D8.

The image to be filled is the input object of convolutional layer E1.

For convolutional layers E1 to E8, the output of the former, after batch normalization and Leaky ReLU activation in turn, serves as the input object of the latter.

The output of convolutional layer E8, after batch normalization and Leaky ReLU activation in turn, serves as the input object of deconvolutional layer D1.

The output of deconvolutional layer D1, after ReLU activation, serves as the first input object of deconvolutional layer D2.

For deconvolutional layers D2 to D8, the output of the former, after ReLU activation and batch normalization in turn, serves as the first input object of the latter.

The second input objects of deconvolutional layers D2 to D8 are, in order, the outputs of convolutional layers E7 to E1 after batch normalization and Leaky ReLU activation.

The output of deconvolutional layer D8 after Tanh activation is the filled image.

Convolutional layer E1 applies 64 convolutions of size 4*4 with stride 2 to its input object.

Convolutional layer E2 applies 128 convolutions of size 4*4 with stride 2.

Convolutional layer E3 applies 256 convolutions of size 4*4 with stride 2.

Convolutional layers E4 to E8 each apply 512 convolutions of size 4*4 with stride 2.

Deconvolutional layers D1 to D4 each apply 512 deconvolutions of size 4*4 with stride 2.

Deconvolutional layer D5 applies 256 deconvolutions of size 4*4 with stride 2.

Deconvolutional layer D6 applies 128 deconvolutions of size 4*4 with stride 2.

Deconvolutional layer D7 applies 64 deconvolutions of size 4*4 with stride 2.

Deconvolutional layer D8 applies 3 deconvolutions of size 4*4 with stride 2.

The generation network obtains an additional feature map by feature map nearest neighbor replacement, based on the output of deconvolutional layer D5 and the output of convolutional layer E3, and uses the additional feature map as the third input object of deconvolutional layer D6.
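For illustration only, the following is a minimal PyTorch-style sketch of a generator with this layout (encoder E1 to E8, decoder D1 to D8, and a shift connection feeding the additional feature map into D6). Padding values, the exact placement of normalization, and all names are assumptions; the shift_fn argument stands for the feature map nearest neighbor replacement sketched further below.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Generator(nn.Module):
    """Encoder E1-E8 / decoder D1-D8 with a shift connection at D6 (illustrative sketch)."""
    def __init__(self, shift_fn):
        super().__init__()
        e_ch = [3, 64, 128, 256, 512, 512, 512, 512, 512]            # E1..E8 output channels
        self.enc = nn.ModuleList([
            nn.Sequential(nn.Conv2d(e_ch[i], e_ch[i + 1], 4, 2, 1),  # 4*4 conv, stride 2
                          nn.BatchNorm2d(e_ch[i + 1]),
                          nn.LeakyReLU(0.2)) for i in range(8)])
        d_in = [512, 1024, 1024, 1024, 1024, 768, 256, 128]          # concatenated decoder inputs
        d_out = [512, 512, 512, 512, 256, 128, 64, 3]                # D1..D8 output channels
        self.dec = nn.ModuleList([
            nn.ConvTranspose2d(d_in[i], d_out[i], 4, 2, 1) for i in range(8)])
        self.dec_bn = nn.ModuleList([nn.BatchNorm2d(c) for c in d_out[1:7]])
        self.shift_fn = shift_fn  # shift_fn(dec_feat, enc_feat, mask) -> additional feature map

    def forward(self, x, mask):
        # x: image to be filled (B, 3, 256, 256); mask: assumed already at feature resolution
        skips = []
        for e in self.enc:                                            # E1..E8
            x = e(x)
            skips.append(x)
        x = F.relu(self.dec[0](skips[-1]))                            # D1 output, ReLU only
        for i in range(1, 8):                                         # D2..D8
            skip = skips[7 - i]                                       # E7..E1 (already BN + LeakyReLU)
            if i == 5:                                                # D6: D5 output, E3 output, shifted map
                extra = self.shift_fn(x, skip, mask)
                x = self.dec[i](torch.cat([x, skip, extra], 1))
            else:
                x = self.dec[i](torch.cat([x, skip], 1))
            x = torch.tanh(x) if i == 7 else self.dec_bn[i - 1](F.relu(x))
        return x                                                      # filled image
```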

Preferably, the specific process by which the generation network obtains the additional feature map by feature map nearest neighbor replacement, based on the output of deconvolutional layer D5 and the output of convolutional layer E3, is as follows:

Select a feature map to be assigned whose values are all 0; this feature map has the same number of channels and the same spatial size as the output feature map of deconvolutional layer D5 and the output feature map of convolutional layer E3.

Compute the mask region of the output feature map of deconvolutional layer D5 and the non-mask region of the output feature map of convolutional layer E3, and cut both the mask region and the non-mask region into multiple feature blocks.

The feature blocks are cuboids of size C*h*w, where C is the number of channels of the output feature map of deconvolutional layer D5, and h and w are the length and width of the cuboid.

For each feature block p1 in the mask region, select the feature block p2 that is closest to p1 among the feature blocks of the non-mask region.

Select the region to be assigned in the feature map to be assigned; this region coincides with the position of feature block p1 in the output feature map of deconvolutional layer D5.

Assign the values of feature block p2 to the region to be assigned.

Preferably, feature block p2 is the block closest to feature block p1 in cosine distance.
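A minimal sketch of this replacement step is shown below, assuming 1*1 feature blocks (h = w = 1), cosine similarity computed on normalized blocks, and a mask already downsampled to the feature map resolution; all names are illustrative. With larger blocks, overlapping regions would need explicit handling, which is omitted here.

```python
import torch
import torch.nn.functional as F

def shift_fn(dec_feat, enc_feat, mask_feat, h=1, w=1):
    """Feature map nearest neighbor replacement (sketch, batch dimension omitted).

    dec_feat: output of D5, shape (C, H, W); enc_feat: output of E3, shape (C, H, W);
    mask_feat: (H, W) map at feature resolution, 1 = mask point (to be filled).
    Returns the additional feature map: zero outside the mask region, and inside it
    the value of the nearest (cosine) non-mask block of enc_feat.
    """
    C, H, W = dec_feat.shape
    pad = (h // 2, w // 2)
    dec_blocks = F.unfold(dec_feat.unsqueeze(0), (h, w), padding=pad)[0].t()  # (H*W, C*h*w)
    enc_blocks = F.unfold(enc_feat.unsqueeze(0), (h, w), padding=pad)[0].t()
    masked = mask_feat.flatten() > 0.5
    q = F.normalize(dec_blocks[masked], dim=1)       # blocks p1 from the mask region
    k = F.normalize(enc_blocks[~masked], dim=1)      # blocks p2 from the non-mask region
    nearest = (q @ k.t()).argmax(dim=1)              # max cosine similarity = min cosine distance
    out = torch.zeros_like(enc_blocks)               # feature map to be assigned, all zeros
    out[masked] = enc_blocks[~masked][nearest]       # assign the value of each nearest block p2
    return F.fold(out.t().unsqueeze(0), (H, W), (h, w), padding=pad)[0]
```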

Preferably, the mask region and the non-mask region of an output feature map are computed as follows:

A mask image is given in place of the image to be filled; the mask image has the same size as the image to be filled, has one channel, and its values are 0 or 1.

A value of 0 indicates that the corresponding position in the image to be filled is not a point to be filled.

A value of 1 indicates that the corresponding position in the image to be filled is a point to be filled.

The mask region and the non-mask region of the feature map of the mask image are computed by a convolutional network comprising a first to a third convolutional layer.

The mask image is the input object of the first convolutional layer.

For the first to the third convolutional layers, the output of the former is the input object of the latter.

The first to the third convolutional layers each apply one 4*4 convolution with stride 2 to their input object.

The output of the third convolutional layer is the feature map of the mask image, with spatial size 32*32 and one channel.

For the feature map of the mask image, when a value is greater than a set threshold, the corresponding feature point is judged to be a mask point; otherwise, it is judged to be a non-mask point.

The mask region of the feature map of the mask image is the set of mask points, and the non-mask region is the set of non-mask points.

The mask region of an output feature map equals the mask region of the feature map of the mask image, and the non-mask region of an output feature map equals the non-mask region of the feature map of the mask image.
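A sketch of this computation follows; the fixed averaging kernels and the threshold value are assumptions, since the text above only fixes the kernel size, stride, and number of layers.

```python
import torch
import torch.nn.functional as F

def mask_regions(mask_img, threshold=0.0):
    """Downsample a (1, 1, 256, 256) mask image (1 = point to be filled) to 32*32
    and split it into mask / non-mask points (illustrative sketch)."""
    x = mask_img
    weight = torch.ones(1, 1, 4, 4) / 16.0            # averaging kernel (assumed)
    for _ in range(3):                                # first to third convolutional layers
        x = F.conv2d(x, weight, stride=2, padding=1)  # one 4*4 convolution, stride 2
    mask_feat = x[0, 0] > threshold                   # 32*32: True = mask point
    return mask_feat, ~mask_feat                      # mask region, non-mask region
```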

Preferably, the generation network is trained with a guidance loss constraint. The guidance loss constraint means that, during training of the generation network, a feature similarity constraint is imposed between the real image and the input image at an arbitrary convolutional or deconvolutional layer.

The input image is the real image after the mask operation.

Preferably, the generation network is trained as follows:

Input the target image Igt into the generation network, compute the mask region of the feature map of the l-th layer, and obtain the information (Φl(Igt))y.

Input the image to be filled I into the generation network, compute the mask region of the feature map of the (L−l)-th layer, and obtain the information (ΦL-l(I))y.

Then define the guidance loss constraint Lg:

Lg = Σy∈Ω ||(ΦL-l(I))y − (Φl(Igt))y||₂²

where Ω is the mask region, L is the total number of layers of the generation network, y is any coordinate point within the mask region, ΦL-l(I) is the feature map output by the generation network at layer L−l when the input object is the image to be filled, (ΦL-l(I))y is the information at y in the mask region of the output feature map of layer L−l, Φl(Igt) is the feature map output by the generation network at layer l when the input object is the target image, and (Φl(Igt))y is the information at y in the mask region of the output feature map of layer l.
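As a minimal sketch, this guidance loss can be written as below, assuming a squared L2 penalty over the mask region (the norm is an assumption consistent with the formula above); names are illustrative.

```python
def guidance_loss(feat_filled, feat_gt, mask_feat):
    """Lg: squared difference between (Phi_{L-l}(I))_y and (Phi_l(I_gt))_y over y in Omega.

    feat_filled, feat_gt: feature maps of shape (C, H, W); mask_feat: (H, W) boolean map."""
    diff = (feat_filled - feat_gt) * mask_feat   # keep only coordinates inside the mask region
    return (diff ** 2).sum()
```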

Preferably, the discrimination network comprises convolutional layers E9 to E13.

The input object of convolutional layer E9 is the filled image.

The output of convolutional layer E9, after Leaky ReLU activation, serves as the input object of convolutional layer E10.

For convolutional layers E10 to E13, the output of the former, after batch normalization and Leaky ReLU activation in turn, serves as the input object of the latter.

The output of convolutional layer E13, after batch normalization and Sigmoid activation in turn, is the output of the discrimination network.

Convolutional layer E9 applies 64 convolutions of size 4*4 with stride 2 to its input object.

Convolutional layer E10 applies 128 convolutions of size 4*4 with stride 2.

Convolutional layer E11 applies 256 convolutions of size 4*4 with stride 2.

Convolutional layer E12 applies 512 convolutions of size 4*4 with stride 1.

Convolutional layer E13 applies one convolution of size 4*4 with stride 1.

Preferably, the filled image is a 256*256 RGB image, and the output of convolutional layer E13 has spatial size 64*64 and one channel.
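For illustration, a PyTorch-style sketch of such a discrimination network is given below; the padding values are assumptions, and the output is the probability map described above.

```python
import torch.nn as nn

class Discriminator(nn.Module):
    """Convolutional layers E9-E13 with Leaky ReLU / batch norm / Sigmoid (illustrative sketch)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, 2, 1), nn.LeakyReLU(0.2),                           # E9
            nn.Conv2d(64, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.LeakyReLU(0.2),    # E10
            nn.Conv2d(128, 256, 4, 2, 1), nn.BatchNorm2d(256), nn.LeakyReLU(0.2),   # E11
            nn.Conv2d(256, 512, 4, 1, 1), nn.BatchNorm2d(512), nn.LeakyReLU(0.2),   # E12
            nn.Conv2d(512, 1, 4, 1, 1), nn.BatchNorm2d(1), nn.Sigmoid())            # E13

    def forward(self, img):
        # img: filled image or real image; output: predicted probability that the input is real
        return self.net(img)
```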

Preferably, the image filling system is trained end to end with the Adam optimization algorithm.

The image filling system of the present invention takes the image to be filled as its input object and performs feature map nearest neighbor replacement on an intermediate output of the decoding part of the generation network, so that a filled image with overall semantic consistency and good sharpness is obtained in a single forward pass. Compared with existing image filling methods, the system obtains the filled image much faster because it requires only one forward pass.

Brief Description of the Drawings

The image filling system based on a convolutional neural network with feature map nearest neighbor replacement according to the present invention is described in more detail below on the basis of embodiments and with reference to the accompanying drawings, in which:

Fig. 1 is a structural block diagram of the generation network referred to in the embodiment;

Fig. 2 is a structural block diagram of the discrimination network referred to in the embodiment;

Fig. 3 is an image to be filled with arbitrarily missing regions;

Fig. 4 is the filled image obtained after inputting the image with arbitrarily missing regions into the generation network;

Fig. 5 is an image to be filled with a missing central region;

Fig. 6 is the filled image obtained after inputting the image with a missing central region into the generation network.

Detailed Description of the Embodiments

The image filling system based on a convolutional neural network with feature map nearest neighbor replacement according to the present invention is further described below with reference to the accompanying drawings.

Embodiment: the present embodiment is described in detail below with reference to Figs. 1 to 6.

The image filling system of this embodiment comprises a generation network and a discrimination network.

The generation network comprises an encoder and a decoder; the encoder comprises N convolutional layers and the decoder comprises N deconvolutional layers, with N ≥ 2.

The generation network obtains the filled image by first encoding and then decoding the image to be filled.

For any M deconvolutional layers among the first to the (N−1)-th deconvolutional layers, the generation network obtains an additional feature map by feature map nearest neighbor replacement, based on the output of each of these deconvolutional layers and the output of the convolutional layer corresponding to that deconvolutional layer, and uses the output of each deconvolutional layer, the output of its corresponding convolutional layer, and the obtained additional feature map together as the input object of the next deconvolutional layer, where 1 ≤ M ≤ N−1.

The discrimination network is used to judge whether the filled image is the real image corresponding to the image to be filled, thereby constraining the weight learning of the generation network.

In this embodiment, the encoder comprises convolutional layers E1 to E8 and the decoder comprises deconvolutional layers D1 to D8.

The image to be filled is the input object of convolutional layer E1.

For convolutional layers E1 to E8, the output of the former, after batch normalization and Leaky ReLU activation in turn, serves as the input object of the latter.

The output of convolutional layer E8, after batch normalization and Leaky ReLU activation in turn, serves as the input object of deconvolutional layer D1.

The output of deconvolutional layer D1, after ReLU activation, serves as the first input object of deconvolutional layer D2.

For deconvolutional layers D2 to D8, the output of the former, after ReLU activation and batch normalization in turn, serves as the first input object of the latter.

The second input objects of deconvolutional layers D2 to D8 are, in order, the outputs of convolutional layers E7 to E1 after batch normalization and Leaky ReLU activation.

The output of deconvolutional layer D8 after Tanh activation is the filled image.

Convolutional layer E1 applies 64 convolutions of size 4*4 with stride 2 to its input object.

Convolutional layer E2 applies 128 convolutions of size 4*4 with stride 2.

Convolutional layer E3 applies 256 convolutions of size 4*4 with stride 2.

Convolutional layers E4 to E8 each apply 512 convolutions of size 4*4 with stride 2.

Deconvolutional layers D1 to D4 each apply 512 deconvolutions of size 4*4 with stride 2.

Deconvolutional layer D5 applies 256 deconvolutions of size 4*4 with stride 2.

Deconvolutional layer D6 applies 128 deconvolutions of size 4*4 with stride 2.

Deconvolutional layer D7 applies 64 deconvolutions of size 4*4 with stride 2.

Deconvolutional layer D8 applies 3 deconvolutions of size 4*4 with stride 2.

The generation network obtains an additional feature map by feature map nearest neighbor replacement, based on the output of deconvolutional layer D5 and the output of convolutional layer E3, and uses the additional feature map as the third input object of deconvolutional layer D6.

In this embodiment, the specific process by which the generation network obtains the additional feature map by feature map nearest neighbor replacement, based on the output of deconvolutional layer D5 and the output of convolutional layer E3, is as follows:

Select a feature map to be assigned whose values are all 0; this feature map has the same number of channels and the same spatial size as the output feature map of deconvolutional layer D5 and the output feature map of convolutional layer E3.

Compute the mask region of the output feature map of deconvolutional layer D5 and the non-mask region of the output feature map of convolutional layer E3, and cut both the mask region and the non-mask region into multiple feature blocks.

The feature blocks are cuboids of size C*h*w, where C is the number of channels of the output feature map of deconvolutional layer D5, and h and w are the length and width of the cuboid.

For each feature block p1 in the mask region, select the feature block p2 that is closest to p1 among the feature blocks of the non-mask region.

Select the region to be assigned in the feature map to be assigned; this region coincides with the position of feature block p1 in the output feature map of deconvolutional layer D5.

Assign the values of feature block p2 to the region to be assigned.

The mask region and the non-mask region of an output feature map are computed as follows:

A mask image is given in place of the image to be filled; the mask image has the same size as the image to be filled, has one channel, and its values are 0 or 1.

A value of 0 indicates that the corresponding position in the image to be filled is not a point to be filled.

A value of 1 indicates that the corresponding position in the image to be filled is a point to be filled.

The mask region and the non-mask region of the feature map of the mask image are computed by a convolutional network comprising a first to a third convolutional layer.

The mask image is the input object of the first convolutional layer.

For the first to the third convolutional layers, the output of the former is the input object of the latter.

The first to the third convolutional layers each apply one 4*4 convolution with stride 2 to their input object.

The output of the third convolutional layer is the feature map of the mask image, with spatial size 32*32 and one channel.

For the feature map of the mask image, when a value is greater than a set threshold, the corresponding feature point is judged to be a mask point; otherwise, it is judged to be a non-mask point.

The mask region of the feature map of the mask image is the set of mask points, and the non-mask region is the set of non-mask points.

The mask region of an output feature map equals the mask region of the feature map of the mask image, and the non-mask region of an output feature map equals the non-mask region of the feature map of the mask image.

The generation network of this embodiment is trained with a guidance loss constraint. The guidance loss constraint means that, during training of the generation network, a feature similarity constraint is imposed between the real image and the input image at an arbitrary convolutional or deconvolutional layer.

The input image is the real image after the mask operation.

The generation network of this embodiment is trained as follows:

Input the target image Igt into the generation network, compute the mask region of the feature map of the l-th layer, and obtain the information (Φl(Igt))y.

Input the image to be filled I into the generation network, compute the mask region of the feature map of the (L−l)-th layer, and obtain the information (ΦL-l(I))y.

Then define the guidance loss constraint Lg:

Lg = Σy∈Ω ||(ΦL-l(I))y − (Φl(Igt))y||₂²

where Ω is the mask region, L is the total number of layers of the generation network, y is any coordinate point within the mask region, ΦL-l(I) is the feature map output by the generation network at layer L−l when the input object is the image to be filled, (ΦL-l(I))y is the information at y in the mask region of the output feature map of layer L−l, Φl(Igt) is the feature map output by the generation network at layer l when the input object is the target image, and (Φl(Igt))y is the information at y in the mask region of the output feature map of layer l.

In addition, the image obtained by passing the image to be filled I through the generation network is denoted Φ(I;W), where W are the parameters of the generation network model, and a reconstruction loss is defined between Φ(I;W) and the target image Igt.

For each (ΦL-l(I))y, its distance to (Φl(I))x is computed, where x is any coordinate point in the non-mask region, (Φl(I))x is the information at x in the non-mask region of the output feature map of layer l, and Ω̄ denotes the non-mask region.

The distance metric is the cosine distance, so the nearest point is

x*(y) = argmin x∈Ω̄ d((ΦL-l(I))y, (Φl(I))x), with d(a, b) = 1 − ⟨a, b⟩ / (||a||·||b||).

After finding the nearest point x*(y), the value at the same planar position as y in the feature map to be assigned is replaced with (Φl(I))x*(y); the result is the additional feature map to be input into the next deconvolutional layer.

That is, the additional feature map takes the value (Φl(I))x*(y) at every position y in the mask region and 0 elsewhere.
The discrimination network of this embodiment comprises convolutional layers E9 to E13.

The input object of convolutional layer E9 is the filled image.

The output of convolutional layer E9, after Leaky ReLU activation, serves as the input object of convolutional layer E10.

For convolutional layers E10 to E13, the output of the former, after batch normalization and Leaky ReLU activation in turn, serves as the input object of the latter.

The output of convolutional layer E13, after batch normalization and Sigmoid activation in turn, is the output of the discrimination network.

Convolutional layer E9 applies 64 convolutions of size 4*4 with stride 2 to its input object.

Convolutional layer E10 applies 128 convolutions of size 4*4 with stride 2.

Convolutional layer E11 applies 256 convolutions of size 4*4 with stride 2.

Convolutional layer E12 applies 512 convolutions of size 4*4 with stride 1.

Convolutional layer E13 applies one convolution of size 4*4 with stride 1.

The filled image is a 256*256 RGB image, and the output of convolutional layer E13 has spatial size 64*64 and one channel.

The input to the discrimination network is either Φ(I;W), the output of the generation network, or Igt. The generation network and the discrimination network are trained adversarially, which gives rise to the adversarial loss Ladv:

Ladv = min W max D E Igt~pdata(Igt) [log D(Igt)] + E I~pmiss(I) [log(1 − D(Φ(I;W)))]

where pdata(Igt) is the distribution of real images, pmiss(I) is the distribution of input images, D(·) denotes the probability predicted by the discrimination network that an image fed into it comes from pdata(Igt), log is the logarithm function, Igt is the target image, and I is the image to be filled.

Therefore, when training the generation network, the total loss L is:

L = Lrec + λg·Lg + λadv·Ladv

where Lrec is the reconstruction loss, and λg and λadv are both hyperparameters.
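A hedged end-to-end training sketch combining the losses above with Adam is shown below. The l1 reconstruction norm, the hyperparameter values, and the gen.feat_Ll / gen.feat_l feature hooks are assumptions introduced only for illustration; guidance_loss refers to the sketch given earlier.

```python
import torch
import torch.nn.functional as F

def train_step(gen, disc, opt_g, opt_d, I, I_gt, mask, lambda_g=0.01, lambda_adv=0.002):
    # --- discriminator update: real images versus generator outputs ---
    filled = gen(I, mask)
    d_real, d_fake = disc(I_gt), disc(filled.detach())
    loss_d = -(torch.log(d_real + 1e-8).mean() + torch.log(1 - d_fake + 1e-8).mean())
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # --- generator update: reconstruction + guidance + adversarial terms ---
    filled = gen(I, mask)
    loss_rec = F.l1_loss(filled, I_gt)                    # reconstruction term (l1 norm assumed)
    loss_guide = guidance_loss(gen.feat_Ll(I), gen.feat_l(I_gt), mask)  # hypothetical feature hooks
    loss_adv = -torch.log(disc(filled) + 1e-8).mean()
    loss = loss_rec + lambda_g * loss_guide + lambda_adv * loss_adv
    opt_g.zero_grad()
    loss.backward()
    opt_g.step()
    return loss_d.item(), loss.item()

# End-to-end optimization with Adam (learning rate and betas are placeholder values):
# opt_g = torch.optim.Adam(gen.parameters(), lr=2e-4, betas=(0.5, 0.999))
# opt_d = torch.optim.Adam(disc.parameters(), lr=2e-4, betas=(0.5, 0.999))
```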

Fig. 3 shows an image to be filled with arbitrarily missing regions, and Fig. 4 shows the filled image obtained after inputting it into the generation network. Comparing Fig. 3 with Fig. 4 shows that the image filling system of this embodiment is suitable for filling images with arbitrarily missing regions and achieves a good filling effect.

Fig. 5 shows an image to be filled with a missing central region, and Fig. 6 shows the filled image obtained after inputting it into the generation network. Comparing Fig. 5 with Fig. 6 shows that the image filling system of this embodiment is suitable for filling images with a missing central region and achieves a good filling effect.

In simulation experiments, the image filling system of this embodiment takes about 80 ms to fill a 256*256 RGB image. Compared with existing image filling methods, which take tens of seconds to several minutes, the improvement in filling speed is substantial.

The image filling system of this embodiment is trained end to end with the Adam optimization algorithm.

Although the invention is described herein with reference to specific embodiments, it should be understood that these embodiments are merely illustrative of the principles and applications of the invention. It should therefore be understood that numerous modifications may be made to the exemplary embodiments and that other arrangements may be devised without departing from the spirit and scope of the invention as defined by the appended claims. It should be understood that the different dependent claims and the features described herein may be combined in ways different from those described in the original claims. It should also be understood that features described in connection with individual embodiments may be used in other described embodiments.

Claims (10)

1. An image filling system based on a convolutional neural network with feature map nearest neighbor replacement, characterized in that the image filling system comprises a generation network and a discrimination network;
the generation network comprises an encoder and a decoder, the encoder comprises N convolutional layers and the decoder comprises N deconvolutional layers, N ≥ 2;
the generation network obtains a filled image by first encoding and then decoding an image to be filled;
for any M deconvolutional layers among the first to the (N−1)-th deconvolutional layers, the generation network obtains an additional feature map by feature map nearest neighbor replacement based on the output of each of these deconvolutional layers and the output of the convolutional layer corresponding to that deconvolutional layer, and uses the output of each deconvolutional layer, the output of the convolutional layer corresponding to that deconvolutional layer, and the obtained additional feature map together as the input object of the next deconvolutional layer;
the discrimination network is used to judge whether the filled image is the real image corresponding to the image to be filled, thereby constraining the weight learning of the generation network.
2. The image filling system based on a convolutional neural network with feature map nearest neighbor replacement according to claim 1, characterized in that the encoder comprises convolutional layers E1 to E8 and the decoder comprises deconvolutional layers D1 to D8;
the image to be filled is the input object of convolutional layer E1;
for convolutional layers E1 to E8, the output of the former, after batch normalization and Leaky ReLU activation in turn, serves as the input object of the latter;
the output of convolutional layer E8, after batch normalization and Leaky ReLU activation in turn, serves as the input object of deconvolutional layer D1;
the output of deconvolutional layer D1, after ReLU activation, serves as the first input object of deconvolutional layer D2;
for deconvolutional layers D2 to D8, the output of the former, after ReLU activation and batch normalization in turn, serves as the first input object of the latter;
the second input objects of deconvolutional layers D2 to D8 are, in order, the outputs of convolutional layers E7 to E1 after batch normalization and Leaky ReLU activation;
the output of deconvolutional layer D8 after Tanh activation is the filled image;
convolutional layer E1 applies 64 convolutions of size 4*4 with stride 2 to its input object;
convolutional layer E2 applies 128 convolutions of size 4*4 with stride 2 to its input object;
convolutional layer E3 applies 256 convolutions of size 4*4 with stride 2 to its input object;
convolutional layers E4 to E8 each apply 512 convolutions of size 4*4 with stride 2 to their input object;
deconvolutional layers D1 to D4 each apply 512 deconvolutions of size 4*4 with stride 2 to their input object;
deconvolutional layer D5 applies 256 deconvolutions of size 4*4 with stride 2 to its input object;
deconvolutional layer D6 applies 128 deconvolutions of size 4*4 with stride 2 to its input object;
deconvolutional layer D7 applies 64 deconvolutions of size 4*4 with stride 2 to its input object;
deconvolutional layer D8 applies 3 deconvolutions of size 4*4 with stride 2 to its input object;
the generation network obtains an additional feature map by feature map nearest neighbor replacement based on the output of deconvolutional layer D5 and the output of convolutional layer E3, and uses the additional feature map as the third input object of deconvolutional layer D6.
3. The image filling system based on a convolutional neural network with feature map nearest neighbor replacement according to claim 2, characterized in that the specific process by which the generation network obtains the additional feature map by feature map nearest neighbor replacement, based on the output of deconvolutional layer D5 and the output of convolutional layer E3, is as follows:
selecting a feature map to be assigned whose values are all 0, the feature map to be assigned having the same number of channels and the same spatial size as the output feature map of deconvolutional layer D5 and the output feature map of convolutional layer E3;
computing the mask region of the output feature map of deconvolutional layer D5 and the non-mask region of the output feature map of convolutional layer E3, and cutting the mask region and the non-mask region into multiple feature blocks;
the feature blocks being cuboids of size C*h*w, where C, h and w are respectively the number of channels of the output feature map of deconvolutional layer D5, the length of the cuboid and the width of the cuboid;
for each feature block p1 in the mask region, selecting the feature block p2 that is closest to feature block p1 among the feature blocks of the non-mask region;
selecting a region to be assigned in the feature map to be assigned, the region to be assigned coinciding with the position of feature block p1 in the output feature map of deconvolutional layer D5;
assigning the values of feature block p2 to the region to be assigned.
4. The image filling system based on a convolutional neural network with feature map nearest neighbor replacement according to claim 3, characterized in that feature block p2 is closest to feature block p1 in cosine distance.
5. The image filling system based on a convolutional neural network with feature map nearest neighbor replacement according to claim 4, characterized in that the mask region and the non-mask region of an output feature map are computed as follows:
a mask image is given in place of the image to be filled, the mask image having the same size as the image to be filled, one channel, and values of 0 or 1;
0 indicates that the corresponding position in the image to be filled is not a point to be filled;
1 indicates that the corresponding position in the image to be filled is a point to be filled;
the mask region and the non-mask region of the feature map of the mask image are computed by a convolutional network comprising a first to a third convolutional layer;
the mask image is the input object of the first convolutional layer;
for the first to the third convolutional layers, the output of the former is the input object of the latter;
the first to the third convolutional layers each apply one 4*4 convolution with stride 2 to their input object;
the output of the third convolutional layer is the feature map of the mask image, with spatial size 32*32 and one channel;
for the feature map of the mask image, when a value is greater than a set threshold, the corresponding feature point is judged to be a mask point; otherwise, the feature point is judged to be a non-mask point;
the mask region of the feature map of the mask image is the set of mask points, and the non-mask region of the feature map of the mask image is the set of non-mask points;
the mask region of the output feature map equals the mask region of the feature map of the mask image, and the non-mask region of the output feature map equals the non-mask region of the feature map of the mask image.
6. The image filling system based on a convolutional neural network with feature map nearest neighbor replacement according to claim 5, characterized in that the generation network is trained with a guidance loss constraint, the guidance loss constraint meaning that, during training of the generation network, a feature similarity constraint is imposed between the real image and the input image at an arbitrary convolutional or deconvolutional layer;
the input image is the real image after the mask operation.
7. The image filling system based on a convolutional neural network with feature map nearest neighbor replacement according to claim 6, characterized in that the generation network is trained as follows:
inputting the target image Igt into the generation network, computing the mask region of the feature map of the l-th layer, and obtaining the information (Φl(Igt))y;
inputting the image to be filled I into the generation network, computing the mask region of the feature map of the (L−l)-th layer, and obtaining the information (ΦL-l(I))y;
then defining the guidance loss constraint Lg:
Lg = Σy∈Ω ||(ΦL-l(I))y − (Φl(Igt))y||₂²
where Ω is the mask region, L is the total number of layers of the generation network, y is any coordinate point within the mask region, ΦL-l(I) is the feature map output by the generation network at layer L−l when the input object is the image to be filled, (ΦL-l(I))y is the information at y in the mask region of the output feature map of layer L−l, Φl(Igt) is the feature map output by the generation network at layer l when the input object is the target image, and (Φl(Igt))y is the information at y in the mask region of the output feature map of layer l.
8. The image filling system based on a convolutional neural network with feature map nearest neighbor replacement according to claim 7, characterized in that the discrimination network comprises convolutional layers E9 to E13;
the input object of convolutional layer E9 is the filled image;
the output of convolutional layer E9, after Leaky ReLU activation, serves as the input object of convolutional layer E10;
for convolutional layers E10 to E13, the output of the former, after batch normalization and Leaky ReLU activation in turn, serves as the input object of the latter;
the output of convolutional layer E13, after batch normalization and Sigmoid activation in turn, is the output of the discrimination network;
convolutional layer E9 applies 64 convolutions of size 4*4 with stride 2 to its input object;
convolutional layer E10 applies 128 convolutions of size 4*4 with stride 2 to its input object;
convolutional layer E11 applies 256 convolutions of size 4*4 with stride 2 to its input object;
convolutional layer E12 applies 512 convolutions of size 4*4 with stride 1 to its input object;
convolutional layer E13 applies one convolution of size 4*4 with stride 1 to its input object.
9. The image filling system based on a convolutional neural network with feature map nearest neighbor replacement according to claim 8, characterized in that the filled image is a 256*256 RGB image, and the output of convolutional layer E13 has spatial size 64*64 and one channel.
10. The image filling system based on a convolutional neural network with feature map nearest neighbor replacement according to claim 9, characterized in that the image filling system is trained end to end with the Adam optimization algorithm.
CN201711416650.4A 2017-12-22 2017-12-22 Image Filling System Based on Feature Map Nearest Neighbor Replacement with Convolutional Neural Networks Active CN108171663B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711416650.4A CN108171663B (en) 2017-12-22 2017-12-22 Image Filling System Based on Feature Map Nearest Neighbor Replacement with Convolutional Neural Networks

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711416650.4A CN108171663B (en) 2017-12-22 2017-12-22 Image Filling System Based on Feature Map Nearest Neighbor Replacement with Convolutional Neural Networks

Publications (2)

Publication Number Publication Date
CN108171663A true CN108171663A (en) 2018-06-15
CN108171663B CN108171663B (en) 2021-05-25

Family

ID=62520202

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711416650.4A Active CN108171663B (en) 2017-12-22 2017-12-22 Image Filling System Based on Feature Map Nearest Neighbor Replacement with Convolutional Neural Networks

Country Status (1)

Country Link
CN (1) CN108171663B (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108898647A (en) * 2018-06-27 2018-11-27 Oppo(重庆)智能科技有限公司 Image processing method, device, mobile terminal and storage medium
CN109087375A (en) * 2018-06-22 2018-12-25 华东师范大学 Image cavity fill method based on deep learning
CN109300128A (en) * 2018-09-29 2019-02-01 聚时科技(上海)有限公司 The transfer learning image processing method of structure is implied based on convolutional Neural net
JP2020005202A (en) * 2018-06-29 2020-01-09 日本放送協会 Video processing device
CN111242874A (en) * 2020-02-11 2020-06-05 北京百度网讯科技有限公司 Image restoration method and device, electronic equipment and storage medium
CN111614974A (en) * 2020-04-07 2020-09-01 上海推乐信息技术服务有限公司 Video image restoration method and system
CN112184566A (en) * 2020-08-27 2021-01-05 北京大学 An image processing method and system for removing attached water mist and water droplets
WO2021003936A1 (en) * 2019-07-05 2021-01-14 平安科技(深圳)有限公司 Image segmentation method, electronic device, and computer-readable storage medium
CN112997479A (en) * 2018-11-15 2021-06-18 Oppo广东移动通信有限公司 Method, system and computer readable medium for processing images across a phase jump connection
CN113330480A (en) * 2019-02-11 2021-08-31 康蒂-特米克微电子有限公司 Modular image restoration method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104025588A (en) * 2011-10-28 2014-09-03 三星电子株式会社 Method and device for intra prediction of video
CN106952239A (en) * 2017-03-28 2017-07-14 厦门幻世网络科技有限公司 image generating method and device
CN107133934A (en) * 2017-05-18 2017-09-05 北京小米移动软件有限公司 Image completion method and device
US20170365038A1 (en) * 2016-06-16 2017-12-21 Facebook, Inc. Producing Higher-Quality Samples Of Natural Images

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104025588A (en) * 2011-10-28 2014-09-03 三星电子株式会社 Method and device for intra prediction of video
US20170365038A1 (en) * 2016-06-16 2017-12-21 Facebook, Inc. Producing Higher-Quality Samples Of Natural Images
CN106952239A (en) * 2017-03-28 2017-07-14 厦门幻世网络科技有限公司 image generating method and device
CN107133934A (en) * 2017-05-18 2017-09-05 北京小米移动软件有限公司 Image completion method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
XU YIFENG: "A Survey of Generative Adversarial Network Theoretical Models and Applications", Journal of Jinhua Polytechnic *
LI CE ET AL.: "Multi-layer Perception Image Dehazing Algorithm Based on Generative Adversarial Mapping Networks", Journal of Computer-Aided Design & Computer Graphics *

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109087375A (en) * 2018-06-22 2018-12-25 华东师范大学 Image cavity fill method based on deep learning
CN109087375B (en) * 2018-06-22 2023-06-23 华东师范大学 Image Hole Filling Method Based on Deep Learning
CN108898647A (en) * 2018-06-27 2018-11-27 Oppo(重庆)智能科技有限公司 Image processing method, device, mobile terminal and storage medium
JP2020005202A (en) * 2018-06-29 2020-01-09 日本放送協会 Video processing device
JP7202087B2 (en) 2018-06-29 2023-01-11 日本放送協会 Video processing device
CN109300128B (en) * 2018-09-29 2022-08-26 聚时科技(上海)有限公司 Transfer learning image processing method based on convolution neural network hidden structure
CN109300128A (en) * 2018-09-29 2019-02-01 聚时科技(上海)有限公司 The transfer learning image processing method of structure is implied based on convolutional Neural net
CN112997479A (en) * 2018-11-15 2021-06-18 Oppo广东移动通信有限公司 Method, system and computer readable medium for processing images across a phase jump connection
CN112997479B (en) * 2018-11-15 2022-11-11 Oppo广东移动通信有限公司 Method, system, and computer-readable medium for processing images across stage skip connections
JP2024041895A (en) * 2019-02-11 2024-03-27 コンティ テミック マイクロエレクトロニック ゲゼルシャフト ミット ベシュレンクテル ハフツング Modular image interpolation method
CN113330480A (en) * 2019-02-11 2021-08-31 康蒂-特米克微电子有限公司 Modular image restoration method
US11961215B2 (en) 2019-02-11 2024-04-16 Conti Temic Microelectronic Gmbh Modular inpainting method
JP2022517849A (en) * 2019-02-11 2022-03-10 コンティ テミック マイクロエレクトロニック ゲゼルシャフト ミット ベシュレンクテル ハフツング Modular image interpolation method
JP7808135B2 (en) 2019-02-11 2026-01-28 コンティ テミック マイクロエレクトロニック ゲゼルシャフト ミット ベシュレンクテル ハフツング Modular image interpolation method
WO2021003936A1 (en) * 2019-07-05 2021-01-14 平安科技(深圳)有限公司 Image segmentation method, electronic device, and computer-readable storage medium
CN111242874A (en) * 2020-02-11 2020-06-05 北京百度网讯科技有限公司 Image restoration method and device, electronic equipment and storage medium
CN111242874B (en) * 2020-02-11 2023-08-29 北京百度网讯科技有限公司 Image restoration method, device, electronic equipment and storage medium
CN111614974A (en) * 2020-04-07 2020-09-01 上海推乐信息技术服务有限公司 Video image restoration method and system
CN111614974B (en) * 2020-04-07 2021-11-30 上海推乐信息技术服务有限公司 Video image restoration method and system
CN112184566B (en) * 2020-08-27 2023-09-01 北京大学 An image processing method and system for removing attached water mist and water droplets
CN112184566A (en) * 2020-08-27 2021-01-05 北京大学 An image processing method and system for removing attached water mist and water droplets

Also Published As

Publication number Publication date
CN108171663B (en) 2021-05-25

Similar Documents

Publication Publication Date Title
CN108171663A (en) The image completion system for the convolutional neural networks that feature based figure arest neighbors is replaced
CN112419327B (en) Image segmentation method, system and device based on generation countermeasure network
CN108520503B (en) A method for repairing face defect images based on autoencoder and generative adversarial network
CN112784954B (en) Method and device for determining neural network
CN110689599B (en) 3D visual saliency prediction method based on non-local enhancement generation countermeasure network
CN111861945B (en) A text-guided image restoration method and system
CN111507150B (en) Face recognition method using multiple image block combination based on deep neural network
JP7263216B2 (en) Object Shape Regression Using Wasserstein Distance
CN109377452B (en) Face image restoration method based on VAE and generation type countermeasure network
CN110276264B (en) Crowd density estimation method based on foreground segmentation graph
CN112767226B (en) Image steganography method and system for automatically learning distortion based on GAN network structure
CN111027464B (en) Iris Recognition Method Jointly Optimized for Convolutional Neural Network and Sequential Feature Coding
CN109829959B (en) Facial analysis-based expression editing method and device
CN113298734B (en) A method and system for image inpainting based on hybrid hole convolution
CN109903236A (en) Face image restoration method and device based on VAE-GAN and similar block search
CN108681689B (en) Frame rate enhanced gait recognition method and device based on generation of confrontation network
CN114187638B (en) A method for facial expression recognition in real environment based on spatial distribution loss function
CN114820381B (en) A digital image restoration method based on structural information embedding and attention mechanism
CN114758293B (en) Deep learning crowd counting method based on auxiliary branch optimization and local density block enhancement
CN115908842A (en) Transformer Partial Discharge Data Enhancement and Recognition Method
CN116189281B (en) End-to-end human behavior classification method and system based on spatiotemporal adaptive fusion
CN112651360A (en) Skeleton action recognition method under small sample
Yang et al. Inversion based on a detached dual-channel domain method for StyleGAN2 embedding
CN114283265A (en) Unsupervised face correcting method based on 3D rotation modeling
CN116452904B (en) Image aesthetic quality determination method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant