Journal of Nanjing University (Natural Science), 2021, Vol. 57, Issue (1): 1–9. doi: 10.13232/j.cnki.jnju.2021.01.001

Photorealism style transfer combining MRFs-based and Gram-based features

Xianhua Zeng1,2, Yuzhe Lu1,2, Shiyue Tong1,2, Liming Xu1,2

  1. College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, 400050, China
  2. Key Laboratory of Image Recognition, Chongqing, 400050, China
  • Received: 2020-08-28 Online: 2021-01-30 Published: 2021-01-21
  • Contact: Xianhua Zeng E-mail: zengxh@cqupt.edu.cn
  • Supported by:
    National Natural Science Foundation of China (61672120); Natural Science Foundation of Chongqing (cstc2019jcyj-zdxmX0011)

Abstract:

Neural style transfer uses deep learning to transfer the style of a reference image onto a content image. Recent approaches have been successful in general, but they have an important weakness when applied to photorealistic images: the image structure is prone to change, so the output looks unreal because textures are severely distorted, or the result as a whole lacks aesthetic appeal. To improve the quality of generated images in photorealistic style transfer, we present a style transfer method based on convolutional neural networks. First, to fuse information from different layers efficiently into a feature representation that keeps the generated image rich, an aggregation method combines features from shallower and deeper layers to represent the style image. Second, the total style loss combines a global style loss and a local style loss, so that the generated image keeps a globally consistent style while retaining local detail. The global style loss is Gram-based, using Gram matrices to represent global style features, and the local style loss is based on Markov random fields (MRFs). To restrain changes in image structure, we preserve edges by constraining the transformation to be locally affine in color space. We also propose a neural-network-based semantic segmentation module that constrains texture spillover between different semantic regions. It automatically generates the semantic segmentation map of the input image, which saves the time otherwise spent constructing semantic regions by hand. Experimental results show that the method produces realistic and visually pleasing images in a variety of style scenarios.

Key words: style transfer, deep learning, photorealism, automatic semantic segmentation
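
The two style terms described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: plain arrays stand in for CNN feature maps, the Gram normalization, the patch size, and the weights `w_global` and `w_local` are hypothetical choices, and the patch matching follows the normalized cross-correlation idea of Li and Wand [11].

```python
import numpy as np

def gram_matrix(feat):
    """Gram matrix of a feature map shaped (channels, height, width)."""
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)
    # Channel-by-channel correlations; the normalization is an assumption.
    return flat @ flat.T / (c * h * w)

def global_style_loss(gen_feat, style_feat):
    """Squared difference between the Gram matrices (global style)."""
    diff = gram_matrix(gen_feat) - gram_matrix(style_feat)
    return float(np.sum(diff ** 2))

def extract_patches(feat, size=3):
    """All size x size patches, one flattened patch per row."""
    c, h, w = feat.shape
    rows = [feat[:, i:i + size, j:j + size].ravel()
            for i in range(h - size + 1)
            for j in range(w - size + 1)]
    return np.stack(rows)

def local_style_loss(gen_feat, style_feat, size=3):
    """MRF-style loss: match each generated patch to its nearest style patch."""
    gen = extract_patches(gen_feat, size)
    sty = extract_patches(style_feat, size)
    # Normalized cross-correlation between every pair of patches.
    gen_n = gen / (np.linalg.norm(gen, axis=1, keepdims=True) + 1e-8)
    sty_n = sty / (np.linalg.norm(sty, axis=1, keepdims=True) + 1e-8)
    nearest = np.argmax(gen_n @ sty_n.T, axis=1)
    return float(np.sum((gen - sty[nearest]) ** 2))

def total_style_loss(gen_feat, style_feat, w_global=1.0, w_local=1.0):
    """Weighted sum of the global (Gram) and local (MRF) style terms."""
    return (w_global * global_style_loss(gen_feat, style_feat)
            + w_local * local_style_loss(gen_feat, style_feat))
```

In an actual pipeline the feature maps would come from chosen layers of a pretrained network and the loss would be minimized with respect to the generated image. The Gram term ignores spatial layout, while the patch term compares local neighborhoods, which is why the two are complementary.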

CLC number: TP391.4

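Figure 1. Comparison of the proposed method with other style transfer methods [11,14]

Figure 2. Overall framework of the proposed algorithm

Figure 3. Aggregation process of the DLA module

Figure 4. Result images (bottom) generated from different initialization images (top)

Figure 5. Effect of increasing the local style loss weight on the generated image

Figure 6. Effect of increasing the affine loss weight on the generated image

Figure 7. Effect of increasing the semantic map weight on the generated image

Figure 8. Effect of different loss terms on the generated image

Figure 9. Comparison of the proposed method with other style transfer and color transfer algorithms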

Table 1. Comparison of SSIM values of images generated by style transfer

Method         pic1    pic2    pic3    pic4    pic5    pic6
Li [11]        0.1439  0.5602  0.7626  0.1439  0.5693  0.2751
Luan [14]      0.6579  0.6723  0.6604  0.6669  0.8030  0.7649
Penhouet [15]  0.6355  0.7779  0.8743  0.9428  0.7675  0.6700
Ours           0.7906  0.8594  0.7233  0.7722  0.8222  0.7722
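
For readers unfamiliar with the metric, the structural similarity index (SSIM) can be sketched as below. This is a simplified illustration that computes the statistics once over the whole image. Standard implementations instead average the same expression over local windows, and this page does not state which implementation or which image pairs produced the values in the table, so the function name `global_ssim` and its defaults are assumptions.

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM between two grayscale images of equal shape."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    # Stabilizing constants with the commonly used K1 = 0.01, K2 = 0.03.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

The score equals 1 when the two images are identical and falls as luminance, contrast, or structure diverge, so higher values in the table indicate that more image structure was preserved.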

Table 2. Comparison of FSIM values of images generated by style transfer

Method         pic1    pic2    pic3    pic4    pic5    pic6
Li [11]        0.6966  0.7534  0.8799  0.6967  0.7457  0.7222
Luan [14]      0.8613  0.8407  0.8650  0.8269  0.8900  0.9266
Penhouet [15]  0.7836  0.8484  0.9433  0.9529  0.8021  0.8065
Ours           0.9242  0.9190  0.9256  0.9561  0.8305  0.8948
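
The per-image scores in the two tables can be summarized with a short calculation. The sketch below copies the reported values and averages them per method. The averaging is an illustration added here, not an analysis reported on this page, and the helper names `mean_scores` and `best_method` are hypothetical.

```python
# Scores copied from the SSIM and FSIM tables (pic1..pic6 per method).
SSIM = {
    "Li [11]":       [0.1439, 0.5602, 0.7626, 0.1439, 0.5693, 0.2751],
    "Luan [14]":     [0.6579, 0.6723, 0.6604, 0.6669, 0.8030, 0.7649],
    "Penhouet [15]": [0.6355, 0.7779, 0.8743, 0.9428, 0.7675, 0.6700],
    "Ours":          [0.7906, 0.8594, 0.7233, 0.7722, 0.8222, 0.7722],
}
FSIM = {
    "Li [11]":       [0.6966, 0.7534, 0.8799, 0.6967, 0.7457, 0.7222],
    "Luan [14]":     [0.8613, 0.8407, 0.8650, 0.8269, 0.8900, 0.9266],
    "Penhouet [15]": [0.7836, 0.8484, 0.9433, 0.9529, 0.8021, 0.8065],
    "Ours":          [0.9242, 0.9190, 0.9256, 0.9561, 0.8305, 0.8948],
}

def mean_scores(table):
    """Average score per method over the six test images."""
    return {method: sum(vals) / len(vals) for method, vals in table.items()}

def best_method(table):
    """Method with the highest average score."""
    means = mean_scores(table)
    return max(means, key=means.get)

if __name__ == "__main__":
    for name, table in (("SSIM", SSIM), ("FSIM", FSIM)):
        for method, value in mean_scores(table).items():
            print(f"{name} {method}: {value:.4f}")
        print(f"{name} best on average: {best_method(table)}")
```

On these values the proposed method has the highest mean for both metrics, although individual images differ: Penhouet [15] scores higher on pic3 and pic4 for SSIM, and Luan [14] scores higher on pic5 and pic6 for FSIM.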

References:

1. Jing Y C, Yang Y Z, Feng Z L, et al. Neural style transfer: a review. IEEE Transactions on Visualization and Computer Graphics, 2019, doi:10.1109/TVCG.2019.2921336.
2. Hertzmann A, Jacobs C E, Oliver N, et al. Image analogies ∥ Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques. New York, NY, USA: ACM, 2001: 327-340.
3. Reinhard E, Ashikhmin M, Gooch B, et al. Color transfer between images. IEEE Computer Graphics and Applications, 2002, 21(5): 34-41.
4. Xiao X Z, Ma L Z. Color transfer in correlated color space ∥ Proceedings of 2006 ACM International Conference on Virtual Reality Continuum and Its Applications. New York, NY, USA: ACM, 2006: 305-309.
5. He L, Qi H R, Zaretzki R. Image color transfer to evoke different emotions based on color combinations. Signal, Image and Video Processing, 2015, 9(8): 1965-1973.
6. Welsh T, Ashikhmin M, Mueller K. Transferring color to greyscale images ∥ Proceedings of the 29th Annual Conference on Computer Graphics and Interactive Techniques. New York, NY, USA: ACM, 2002: 277-280.
7. Gatys L A, Ecker A S, Bethge M. Texture synthesis using convolutional neural networks ∥ Proceedings of the 28th International Conference on Neural Information Processing Systems. Cambridge, MA, USA: ACM, 2015: 262-270.
8. Gatys L A, Ecker A S, Bethge M. Image style transfer using convolutional neural networks ∥ Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA: IEEE, 2016: 2414-2423.
9. Risser E, Wilmot P, Barnes C. Stable and controllable neural texture synthesis and style transfer using histogram losses. 2017, arXiv:1701.08893.
10. Selim A, Elgharib M, Doyle L. Painting style transfer for head portraits using convolutional neural networks. ACM Transactions on Graphics, 2016, 35(4): 129.
11. Li C, Wand M. Combining Markov random fields and convolutional neural networks for image synthesis ∥ Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA: IEEE, 2016: 2479-2486.
12. Champandard A J. Semantic style transfer and turning two-bit doodles into fine artworks. 2016, arXiv:1603.01768.
13. Wang Z Z, Zhao L, Xing W, et al. GLStyleNet: higher quality style transfer combining global and local pyramid features. 2018, arXiv:1811.07260.
14. Luan F J, Paris S, Shechtman E, et al. Deep photo style transfer ∥ Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE, 2017: 4990-4998.
15. Penhouët S, Sanzenbacher P. Automated deep photo style transfer. 2019, arXiv:1901.03915.
16. Hinton G E. Reducing the dimensionality of data with neural networks. Science, 2006, 313(5786): 504-507.
17. Yu F, Wang D Q, Shelhamer E, et al. Deep layer aggregation ∥ Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, UT, USA: IEEE, 2018: 2403-2412.
18. Levin A, Lischinski D, Weiss Y. A closed-form solution to natural image matting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007, 30(2): 228-242.
19. Wang J D, Sun K, Cheng T H, et al. Deep high-resolution representation learning for visual recognition. 2019, arXiv:1908.07919.
20. Luan F J, Paris S, Shechtman E, et al. Deep painterly harmonization. Computer Graphics Forum, 2018, 37(4): 95-106.
21. Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. 2014, arXiv:1409.1556.
22. Johnson J, Alahi A, Li F F. Perceptual losses for real-time style transfer and super-resolution ∥ European Conference on Computer Vision. Springer Berlin Heidelberg, 2016: 694-711.
23. Li C, Wand M. Precomputed real-time texture synthesis with Markovian generative adversarial networks ∥ European Conference on Computer Vision. Springer Berlin Heidelberg, 2016: 702-716.
24. Isola P, Zhu J Y, Zhou T H, et al. Image-to-image translation with conditional adversarial networks ∥ Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE, 2017: 1125-1134.