A Transformer-Based Approach for Video Deepfake Detection
Abstract
Rapid advances in deepfake technology have made it possible to generate videos with realistic facial manipulations, raising serious concerns about the authenticity of digital content. This work proposes HDU-DFNet, a Transformer-based deep learning model for the automatic detection of deepfake videos. Experimental results indicate that the model achieves higher accuracy and better generalization than conventional CNN architectures such as ResNet50. The model effectively identifies fine-grained facial inconsistencies and blending artifacts arising from face-swapping operations. In addition, interpretability analyses are applied to clarify the model’s reasoning process, highlighting the key facial regions associated with forgery detection.
Keywords
deepfake detection, deep learning, image classification