Generative Image Watermarking
Jinhe Li, Haozhong Yang, Hongxia Wang
I. Research Background of Generative Image Watermarking
Figure 1. Examples of generated images versus real images
II. Research Progress on Generative Image Watermarking Algorithms
Figure 2. Timeline of key image generation models
1. Schemes Based on Flow Models
2. Schemes Based on Generative Adversarial Networks
3. Schemes Based on Diffusion Models
(1) Modifying the image data
(2) Adjusting the generative model
(3) Modifying the latent space (see the sketch below)
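To make category (3) concrete, the following is a minimal, illustrative sketch in the spirit of Tree-Ring watermarking [40]: a key pattern is planted in the Fourier spectrum of the initial Gaussian latent before sampling begins, and detection checks whether the spectrum of a latent recovered by DDIM inversion matches that key. The function names, ring parameters, and the simple L1 threshold are illustrative assumptions, not the published implementation; the diffusion sampling and inversion steps themselves are omitted.

```python
import numpy as np

def embed_ring_watermark(latent, radius=10.0, key=0.0):
    """Plant a ring-shaped key in the Fourier spectrum of one channel
    of the initial Gaussian latent (simplified Tree-Ring-style embed)."""
    spec = np.fft.fftshift(np.fft.fft2(latent))
    h, w = latent.shape
    yy, xx = np.mgrid[:h, :w]
    dist = np.sqrt((yy - h / 2) ** 2 + (xx - w / 2) ** 2)
    mask = np.abs(dist - radius) < 1.0        # a thin ring of frequencies
    spec[mask] = key                          # overwrite them with the key
    watermarked = np.real(np.fft.ifft2(np.fft.ifftshift(spec)))
    return watermarked, mask

def detect_ring_watermark(latent, mask, key=0.0, threshold=1.0):
    """Declare 'watermarked' when the masked spectrum of a latent (in
    practice recovered via DDIM inversion) stays close to the key."""
    spec = np.fft.fftshift(np.fft.fft2(latent))
    return bool(np.mean(np.abs(spec[mask] - key)) < threshold)

z = np.random.randn(64, 64)                   # one 64x64 latent channel
z_w, mask = embed_ring_watermark(z)
print(detect_ring_watermark(z_w, mask))       # True on the marked latent
print(detect_ring_watermark(z, mask))         # almost surely False
```

Because the ring is centered in the shifted spectrum, the modified spectrum stays Hermitian-symmetric and the watermarked latent remains real-valued, which is why the key survives the FFT round trip exactly.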
III. Algorithm Performance and Evaluation Metrics
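Since this section of the outline carries no body text, the sketch below only illustrates two metrics that typically anchor such an evaluation: bit accuracy for watermark extraction and PSNR for the visual fidelity of the watermarked image (FID [45] and LPIPS [46] are the usual learned-perception complements). The helper names and test values are illustrative assumptions.

```python
import numpy as np

def bit_accuracy(true_bits, decoded_bits):
    """Fraction of watermark bits recovered correctly (1.0 = lossless)."""
    return float(np.mean(np.asarray(true_bits) == np.asarray(decoded_bits)))

def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio between cover and watermarked image."""
    mse = np.mean((reference.astype(np.float64)
                   - distorted.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

msg = np.random.randint(0, 2, 48)                    # a 48-bit watermark
print(bit_accuracy(msg, msg))                        # 1.0
img = np.random.randint(0, 256, (256, 256, 3))
noisy = np.clip(img + np.random.randn(256, 256, 3) * 2.0, 0, 255)
print(psnr(img, noisy))                              # roughly 42 dB
```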
IV. Commonly Used Datasets
Table 1. Comparison of commonly used image datasets
| Name | Scale | Image Category | Notes |
| --- | --- | --- | --- |
| MS-COCO | 287K | Natural images | Non-uniform resolution |
| CIFAR10 | 60K | Natural images | 32×32 resolution |
| DIV2K | 1K | Natural images | Around 2K resolution |
| ILSVRC | 1,431,167 | Natural images | Non-uniform resolution |
| LSUN | 1M | Natural images | 256×256 resolution |
| CelebA | 202,599 | Face images | Non-uniform resolution |
| CelebA-HQ | 30K | Face images | 1024×1024 resolution |
| FFHQ | 70K | Face images | 1024×1024 resolution |
| AFHQ | 15K | Animal face images | 512×512 resolution |
| WikiArt | 81,444 | Artworks | Non-uniform resolution |
| DiffusionDB | 14M | Text-image pairs | Multimodal dataset, multilingual text |
| LAION-400M | 400M | Text-image pairs | Multimodal dataset, English text |
| LAION-5B | 5B | Text-image pairs | Multimodal dataset, multilingual text |
Note: K denotes one thousand, M one million, and B one billion.
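For orientation, here is a minimal sketch of pulling two of the datasets in Table 1 into Python. It assumes torchvision and the Hugging Face datasets library are installed; the hub identifier poloclub/diffusiondb and its 2m_first_1k subset are the community-hosted names, not something specified in this chapter.

```python
from torchvision.datasets import CIFAR10   # 60K natural images, 32x32
from datasets import load_dataset          # Hugging Face hub access

# CIFAR10 ships with torchvision and downloads on first use;
# the train split alone holds 50K of the 60K images.
cifar = CIFAR10(root="./data", train=True, download=True)
print(len(cifar), cifar[0][0].size)        # 50000 (32, 32)

# DiffusionDB: fetch a 1K-pair subset rather than the full 14M corpus.
# Newer `datasets` versions may require trust_remote_code=True here.
ddb = load_dataset("poloclub/diffusiondb", "2m_first_1k", split="train")
print(ddb[0]["prompt"])                    # the text half of a text-image pair
```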
V. Chapter Summary
References
[1] T. Wang, Y. Zhang, S. Qi, et al., "Security and Privacy on Generative Data in AIGC: A Survey," arXiv preprint, 2023, arXiv:2309.09435.
[2] C. Chen, Z. Wu, Y. Lai, et al., "Challenges and Remedies to Privacy and Security in AIGC: Exploring the Potential of Privacy Computing, Blockchain, and Beyond," arXiv preprint, 2023, arXiv:2306.00419.
[3] I. Goodfellow, J. Pouget-Abadie, M. Mirza, et al., "Generative Adversarial Nets," Advances in Neural Information Processing Systems, 2014, pp. 2672-2680.
[4] D.P. Kingma, M. Welling, "Auto-Encoding Variational Bayes," In Proceedings of the International Conference on Learning Representations, 2014, pp. 1-14.
[5] L. Dinh, D. Krueger, Y. Bengio, "NICE: Non-Linear Independent Components Estimation," arXiv preprint, 2014, arXiv:1410.8516.
[6] L. Dinh, J. Sohl-Dickstein, S. Bengio, "Density Estimation Using Real NVP," arXiv preprint, 2016, arXiv:1605.08803.
[7] J. Ho, A. Jain, P. Abbeel, "Denoising Diffusion Probabilistic Models," Advances in Neural Information Processing Systems, 2020, pp. 6840-6851.
[8] J. Song, C. Meng, S. Ermon, "Denoising Diffusion Implicit Models," In Proceedings of the International Conference on Learning Representations, 2021, pp. 1-22.
[9] C. Lu, Y. Zhou, F. Bao, et al., "DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps," Advances in Neural Information Processing Systems, 2022, pp. 5775-5787.
[10] A. Radford, J.W. Kim, C. Hallacy, et al., "Learning Transferable Visual Models from Natural Language Supervision," In Proceedings of the International Conference on Machine Learning, 2021, pp. 8748-8763.
[11] R. Rombach, A. Blattmann, D. Lorenz, et al., "High-Resolution Image Synthesis with Latent Diffusion Models," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2022, pp. 10684-10695.
[12] A. Ramesh, M. Pavlov, G. Goh, et al., "Zero-Shot Text-to-Image Generation," In Proceedings of the International Conference on Machine Learning, 2021, pp. 8821-8831.
[13] D. Podell, Z. English, K. Lacey, et al., "SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis," In Proceedings of the International Conference on Learning Representations, 2024, pp. 1-13.
[14] Y. Song, P. Dhariwal, M. Chen, et al., "Consistency Models," arXiv preprint, 2023, arXiv:2303.01469.
[15] Z. Guan, J. Jing, X. Deng, et al., "DeepMIH: Deep Invertible Network for Multiple Image Hiding," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 45(1), pp. 372-390.
[16] Y. Xu, C. Mou, Y. Hu, et al., "Robust Invertible Image Steganography," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2022, pp. 7875-7884.
[17] Y. Lan, F. Shang, J. Yang, et al., "Robust Image Steganography: Hiding Messages in Frequency Coefficients," In Proceedings of the AAAI Conference on Artificial Intelligence, 2023, 37(12), pp. 14955-14963.
[18] J. Jing, X. Deng, M. Xu, et al., "HiNet: Deep Image Hiding by Invertible Network," In Proceedings of the IEEE International Conference on Computer Vision, 2021, pp. 4733-4742.
[19] C. Zhang, P. Benz, A. Karjauv, et al., "UDH: Universal Deep Hiding for Steganography, Watermarking, and Light Field Messaging," Advances in Neural Information Processing Systems, 2020, pp. 10223-10234.
[20] H. Fang, Y. Qiu, K. Chen, et al., "Flow-Based Robust Watermarking with Invertible Noise Layer for Black-Box Distortions," In Proceedings of the AAAI Conference on Artificial Intelligence, 2023, 37(4), pp. 5054-5061.
[21] Y. Luo, T. Zhou, F. Liu, et al., "IRWArt: Levering Watermarking Performance for Protecting High-Quality Artwork Images," In Proceedings of the ACM Web Conference, 2023, pp. 2340-2348.
[22] K. Hao, G. Feng, X. Zhang, "Robust Image Watermarking Based on Generative Adversarial Network," China Communications, 2020, 17(11), pp. 131-140.
[23] J. Fei, Z. Xia, B. Tondi, et al., "Supervised GAN Watermarking for Intellectual Property Protection," In Proceedings of the IEEE International Workshop on Information Forensics and Security, 2022, pp. 1-6.
[24] J. Huang, T. Luo, L. Li, et al., "ARWGAN: Attention-Guided Robust Image Watermarking Model Based on GAN," IEEE Transactions on Instrumentation and Measurement, 2023, 72, pp. 1-17.
[25] D.S. Ong, C.S. Chan, K.W. Ng, et al., "Protecting Intellectual Property of Generative Adversarial Networks from Ambiguity Attacks," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021, pp. 3630-3639.
[26] N. Lukas, F. Kerschbaum, "PTW: Pivotal Tuning Watermarking for Pre-Trained Image Generators," In Proceedings of the 32nd USENIX Security Symposium, 2023, pp. 2241-2258.
[27] N. Yu, V. Skripniuk, S. Abdelnabi, et al., "Artificial Fingerprinting for Generative Models: Rooting Deepfake Attribution in Training Data," In Proceedings of the IEEE International Conference on Computer Vision, 2021, pp. 14448-14457.
[28] Y. Zhao, T. Pang, C. Du, et al., "A Recipe for Watermarking Diffusion Models," arXiv preprint, 2023, arXiv:2303.10137.
[29] Y. Cui, J. Ren, H. Xu, et al., "DiffusionShield: A Watermark for Copyright Protection Against Generative Diffusion Models," arXiv preprint, 2023, arXiv:2306.04642.
[30] Y. Cui, J. Ren, Y. Lin, et al., "FT-Shield: A Watermark Against Unauthorized Fine-Tuning in Text-to-Image Diffusion Models," arXiv preprint, 2023, arXiv:2310.02401.
[31] Z. Wang, C. Chen, L. Lyu, et al., "DIAGNOSIS: Detecting Unauthorized Data Usages in Text-to-Image Diffusion Models," In Proceedings of the International Conference on Learning Representations, 2024, pp. 1-21.
[32] P. Zhu, T. Takahashi, H. Kataoka, "Watermark-Embedded Adversarial Examples for Copyright Protection Against Diffusion Models," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2024, pp. 24420-24430.
[33] H. Liu, Z. Sun, Y. Mu, "Countering Personalized Text-to-Image Generation with Influence Watermarks," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2024, pp. 12257-12267.
[34] P. Fernandez, G. Couairon, H. Jégou, et al., "The Stable Signature: Rooting Watermarks in Latent Diffusion Models," In Proceedings of the IEEE International Conference on Computer Vision, 2023, pp. 22466-22477.
[35] Y. Ma, Z. Zhao, X. He, et al., "Generative Watermarking Against Unauthorized Subject-Driven Image Synthesis," arXiv preprint, 2023, arXiv:2306.07754.
[36] C. Xiong, C. Qin, G. Feng, et al., "Flexible and Secure Watermarking for Latent Diffusion Model," In Proceedings of the 31st ACM International Conference on Multimedia, 2023, pp. 1668-1676.
[37] Z. Meng, B. Peng, J. Dong, "Latent Watermark: Inject and Detect Watermarks in Latent Diffusion Space," arXiv preprint, 2024, arXiv:2404.00230.
[38] C. Kim, K. Min, M. Patel, et al., "WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2024, pp. 8974-8983.
[39] X. Zhang, R. Li, J. Yu, et al., "EditGuard: Versatile Image Watermarking for Tamper Localization and Copyright Protection," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2024, pp. 11964-11974.
[40] Y. Wen, J. Kirchenbauer, J. Geiping, et al., "Tree-Ring Watermarks: Fingerprints for Diffusion Images that are Invisible and Robust," Advances in Neural Information Processing Systems, 2023, pp. 1-17.
[41] L. Zhang, X. Liu, A.V. Martin, et al., "Robust Image Watermarking Using Stable Diffusion," arXiv preprint, 2024, arXiv:2401.04247.
[42] G.H. Liu, T. Chen, E. Theodorou, et al., "Mirror Diffusion Models for Constrained and Watermarked Generation," Advances in Neural Information Processing Systems, 2023, pp. 1-20.
[43] G. Zhang, L. Wang, Y. Su, et al., "A Training-Free Plug-and-Play Watermark Framework for Stable Diffusion," arXiv preprint, 2024, arXiv:2404.05607.
[44] Z. Yang, K. Zeng, K. Chen, et al., "Gaussian Shading: Provable Performance-Lossless Image Watermarking for Diffusion Models," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2024, pp. 12162-12171.
[45] M. Heusel, H. Ramsauer, T. Unterthiner, et al., "GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium," Advances in Neural Information Processing Systems, 2017, pp. 1-12.
[46] R. Zhang, P. Isola, A.A. Efros, et al., "The Unreasonable Effectiveness of Deep Features as a Perceptual Metric," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 586-595.
[47] T.Y. Lin, M. Maire, S. Belongie, et al., "Microsoft COCO: Common Objects in Context," In Proceedings of the European Conference on Computer Vision, 2014, pp. 740-755.
[48] A. Krizhevsky, G. Hinton, "Learning Multiple Layers of Features from Tiny Images," Technical Report, University of Toronto, 2009.
[49] T. Karras, S. Laine, T. Aila, "A Style-Based Generator Architecture for Generative Adversarial Networks," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 4401-4410.
[50] Y. Choi, Y. Uh, J. Yoo, et al., "StarGAN v2: Diverse Image Synthesis for Multiple Domains," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2020, pp. 8188-8197.
[51] J. Deng, W. Dong, R. Socher, et al., "ImageNet: A Large-Scale Hierarchical Image Database," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 248-255.
[52] C. Schuhmann, R. Vencu, R. Beaumont, et al., "LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs," arXiv preprint, 2021, arXiv:2111.02114.
[53] Z. Liu, P. Luo, X. Wang, et al., "Deep Learning Face Attributes in the Wild," In Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 3730-3738.
[54] T. Karras, T. Aila, S. Laine, et al., "Progressive Growing of GANs for Improved Quality, Stability, and Variation," In Proceedings of the International Conference on Learning Representations, 2018, pp. 1-26.
[55] B. Saleh, A. Elgammal, "Large-Scale Classification of Fine-Art Paintings: Learning the Right Metric on the Right Feature," International Journal for Digital Art History, 2016, 2, pp. 1-26.
[56] Z.J. Wang, E. Montoya, D. Munechika, et al., "DiffusionDB: A Large-Scale Prompt Gallery Dataset for Text-to-Image Generative Models," In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, 2023, pp. 893-911.
[57] R. Timofte, E. Agustsson, L. Van Gool, et al., "NTIRE 2017 Challenge on Single Image Super-Resolution: Methods and Results," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017, pp. 1110-1121.
[58] F. Yu, A. Seff, Y. Zhang, et al., "LSUN: Construction of a Large-Scale Image Dataset Using Deep Learning with Humans in the Loop," arXiv preprint, 2015, arXiv:1506.03365.
[59] C. Schuhmann, R. Beaumont, R. Vencu, et al., "LAION-5B: An Open Large-Scale Dataset for Training Next Generation Image-Text Models," Advances in Neural Information Processing Systems, 2022, pp. 25278-25294.