Citation: ZHUO Li, ZHANG Shiyu, ZHANG Hui, LI Jiafeng. Survey on Techniques of Single Object Tracking in Unmanned Aerial Vehicle Imagery[J]. Journal of Beijing University of Technology, 2021, 47(10): 1174-1187. DOI: 10.11936/bjutxb2020030017
With the rapid development of the unmanned aerial vehicle (UAV) industry, the volume of aerial imagery has grown dramatically, making intelligent analysis and processing of aerial images a new research focus. Object tracking, one of the core technologies, provides fundamental support for imagery content understanding and a range of practical applications. Object tracking in UAV imagery faces many technical challenges arising from complex application scenarios, frequent changes in target scale, changes in target pose, and interference from similar targets. The main techniques of recent years for single object tracking in UAV imagery are summarized, including methods based on correlation filters, methods based on deep learning, and methods that combine the two, and the public UAV imagery datasets and the evaluation metrics for tracking performance are discussed. The performance of typical single object tracking methods is then evaluated and analyzed. Finally, future development trends of object tracking in UAV imagery are summarized and discussed.