FEATURE ENGINEERING WITH CNN MODELS FOR PARTIAL VIDEO COPY DETECTION

Van Hao Le, Dinh Nghiep Le1, Van Cuong Nguyen1
1 Hong Duc University

Abstract

2D convolutional neural networks (CNNs) are a key component of partial video copy detection systems, where they serve as frame-level feature extractors for video retrieval and matching against a large database. However, the performance characteristics of these feature extraction methods have received little attention in the literature. This paper makes two contributions. First, we conduct experiments on a large-scale dataset to demonstrate the generalization capability and clarify the performance characteristics of popular neural networks. Second, we propose a time-series modeling approach that highlights the advantages and limitations of CNN-extracted image features for the partial video copy detection problem.
