EVALUATING THE EFFECTIVENESS OF FEATURE FUSION–BASED OBJECT DETECTION MODELS IN UAV IMAGERY

Dũng Nguyễn; Nguyen Ngoc Thuy Nguyen; Bui Luong Vu Ngoc

doi:10.70117/hdujs.84.2.2026.1146

Issue: No. 84-02.2026: Natural Sciences, Engineering and Technology

Section: Natural Sciences and Technology

Date Published: 25/03/2026

How to Cite

Nguyễn, D., Nguyen, N. N. T., & Bui, L. V. N. (2026). EVALUATING THE EFFECTIVENESS OF FEATURE FUSION–BASED OBJECT DETECTION MODELS IN UAV IMAGERY. Hong Duc University Journal of Science, 84(2), 20-29. https://doi.org/10.70117/hdujs.84.2.2026.1146

Dũng Nguyễn¹, Nguyen Ngoc Thuy Nguyen², Bui Luong Vu Ngoc³
¹ University of Sciences, Hue University
² Hong Duc University
³ Hanoi Medical University, Thanh Hoa Province Branch

Abstract

Object detection from the UAV perspective has attracted increasing attention due to its importance in applications such as traffic monitoring, smart agriculture, and environmental observation. However, UAV imagery often contains small, densely distributed objects with frequent occlusions and complex backgrounds, posing significant challenges. This paper conducts a comprehensive survey and experimental evaluation of modern object detection models based on CNNs and Transformers in UAV scenarios, using the VisDrone2019, TinyPerson, and HIT-UAV benchmarks. The results reveal a clear trade-off between detection accuracy and computational cost, while recent approaches such as adaptive attention mechanisms and multi-scale feature pyramid architectures demonstrate strong potential for achieving a favorable balance between performance and efficiency in UAV deployment.

Keywords

UAV, object detection, deep learning, attention mechanism, multi-level feature pyramid
