Cloud-Based Architecture for YOLOv3 Object Detector using gRPC and Protobuf

Eko Rudiawan Jamzuri(1), Hanjaya Mandala(2), Riska Analia(3), Susanto Susanto(4)

(1) Politeknik Negeri Batam
(2) National Taiwan Normal University
(3) Politeknik Negeri Batam
(4) Politeknik Negeri Batam


Deep learning-based object detectors have surpassed conventional detection methods in accuracy. Their implementation is still limited by hardware capabilities, but this problem can be overcome by combining edge devices with cloud computing. Recent cloud-based object detector architectures are generally built on representational state transfer (RESTful) web services, which exchange data by polling. As a result, such systems have a low detection speed and cannot support real-time data streaming. This study therefore aims to increase the detection speed of cloud-based object recognition systems by using gRPC and Protobuf to support real-time detection. The proposed architecture was deployed on a virtual machine instance (VMI) equipped with a graphics processing unit (GPU). The gRPC server and the YOLOv3 deep learning object detector were executed on the cloud server to handle detection requests from edge devices. The images captured by the edge devices were encoded in Protobuf format to reduce the size of the messages delivered to the cloud server. The results showed that the proposed architecture improved client-side detection speed by 0.27 FPS to 1.72 FPS compared with the state-of-the-art method. The architecture could also support connections from multiple edge devices with a slight performance degradation in the range of 1.78 FPS to 1.83 FPS, depending on the network interface used.
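As a rough illustration of why a Protobuf-encoded image can be smaller than a typical JSON payload, the sketch below hand-encodes a single Protobuf `bytes` field using the public wire format (a tag, a varint length, then the raw payload) and compares its size with a base64-in-JSON encoding. This is not the authors' implementation: the field number, the `image` JSON key, the synthetic payload, and the assumption that a REST baseline would use base64 inside JSON are all illustrative choices, and the helper names are hypothetical.

```python
import base64
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a Protobuf base-128 varint."""
    if value < 0:
        raise ValueError("varint encoding here only supports non-negative integers")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # more bytes follow: set the continuation bit
        else:
            out.append(low7)
            return bytes(out)


def encode_bytes_field(field_number: int, payload: bytes) -> bytes:
    """Encode one length-delimited (wire type 2) field: tag, length, raw bytes."""
    tag = (field_number << 3) | 2
    return encode_varint(tag) + encode_varint(len(payload)) + payload


def encode_json_base64(payload: bytes) -> bytes:
    """Assumed REST-style baseline: the image as base64 text inside a JSON object."""
    text = base64.b64encode(payload).decode("ascii")
    return json.dumps({"image": text}).encode("utf-8")


# Synthetic stand-in for a compressed camera frame (not real image data).
fake_frame = bytes(range(256)) * 400

protobuf_message = encode_bytes_field(1, fake_frame)  # field number 1 is arbitrary
json_message = encode_json_base64(fake_frame)

print("raw payload bytes:  ", len(fake_frame))
print("protobuf message:   ", len(protobuf_message))
print("json+base64 message:", len(json_message))
```

In this sketch the binary field adds only a few bytes of framing to the raw payload, whereas base64 inflates it by about one third. In practice the message would be generated from a `.proto` definition rather than encoded by hand, and the actual saving depends on the image codec and message layout used.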


cloud computing; gRPC; object detection; Protobuf; YOLO


