Object Detection in Last Decade - A Survey*

Usama Arshad(1)


(1) Comsats University Islamabad

Abstract

Purpose: Object detection has been one of the most active research topics of the last decade and is a fundamental, challenging problem in computer vision. Over this period, researchers have substantially improved object detection, aided by technological advances. Methods: This survey describes the advancements in object detection over the last 10 years (2010-2020). Papers published in that period on object detection and its types are discussed with respect to their role in advancing the field. Result: The survey describes different types of object detection, including text detection and face detection, and traces how detection techniques changed over the decade. Object detection is divided into two groups: general detection and task-based detection. General detection is discussed chronologically, together with its variants, while task-based detection covers state-of-the-art algorithms and techniques organized by task. The paper also gives a basic comparison of how some algorithms and techniques have been updated and how they contributed to advances in fields related to object detection. Novelty: The survey summarizes the most important advancements of the last decade and concludes that the work done in this decade makes further advancement in object detection likely.

Keywords

Computer Vision, Object Detection, Text Detection, Face Detection, YOLO, RCNN, Fast RCNN

Scientific Journal of Informatics (SJI)
p-ISSN 2407-7658 | e-ISSN 2460-0040
Published By Department of Computer Science Universitas Negeri Semarang
Website: https://journal.unnes.ac.id/nju/index.php/sji

This work is licensed under a Creative Commons Attribution 4.0 International License.