Open Access

Accurate and Scalable Object Detection for Smart Warehousing Using YOLOv8 and Residual CNNs

Iman Elawady1*, Abdulrahman Al Homsi1, Omer Ahmed Mohamed Ahmed1
1Department of Electrical and Electronic Engineering, Karabuk University, Karabuk, Türkiye
* Corresponding author: imanelawdy@karabuk.edu.tr

Presented at the International Symposium on AI-Driven Engineering Systems (ISADES2025), Tokat, Türkiye, June 19, 2025

SETSCI Conference Proceedings, 2025, 22, pp. 128-131, https://doi.org/10.36287/setsci.22.51.001

Published Date: 10 July 2025

Robust object detection and classification in dynamic industrial scenes remains challenging, particularly when the system must run in real time on mid-range hardware. Classical vision-based methods struggle with low illumination, reflective surfaces, and partial occlusions, all of which are common in manufacturing and warehouse settings. These limitations lead to counting errors, missed detections, and unreliable automation outcomes. Current one-stage detectors such as YOLO offer real-time processing but can lose classification accuracy on visually similar object classes. In this paper we present a hybrid deep learning pipeline that couples YOLOv8 for object localization with a ResNet-18 convolutional neural network for classification refinement. The system covers eleven typical industrial object classes and is designed to run at 30 frames per second on hardware equipped with an RTX 3050 GPU and 16 GB of RAM. It comprises adaptive preprocessing for light normalization and occlusion handling, and it fuses detection and classification results through a weighted scoring scheme. Training used a 10,000-image dataset with class balancing and synthetic augmentation. The system achieved 78.3% mean average precision (mAP@0.5) for detection and 89.2% accuracy for classification. On reflective surfaces, false negatives were reduced by 58%. A GUI written in PyQt5 provides real-time monitoring and control. Field validation showed up to 40% lower total counting error than manual counting, indicating that the system can be developed further toward industrial deployment.

Keywords - YOLOv8, CNN, real-time object detection, industrial automation, deep learning
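
The abstract mentions adaptive preprocessing for light normalization ahead of detection but does not name the method. The following is a minimal, hypothetical sketch assuming CLAHE on the LAB lightness channel, a common choice for dim and reflective industrial scenes; the function name and parameters are illustrative, not taken from the paper.

```python
# A minimal, hypothetical light-normalization step; the abstract does not name
# the method, so CLAHE on the LAB lightness channel is assumed here.
import cv2

def normalize_lighting(frame_bgr, clip_limit=2.0, tile_grid=(8, 8)):
    """Equalize local contrast so dim and reflective regions detect better."""
    lab = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=tile_grid)
    return cv2.cvtColor(cv2.merge((clahe.apply(l), a, b)), cv2.COLOR_LAB2BGR)
```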
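The detect-then-classify pipeline the abstract describes (YOLOv8 localization, ResNet-18 re-classification of each crop, weighted score fusion) could be wired together as sketched below. This is a sketch under assumptions: the ultralytics YOLO API and torchvision ResNet-18 stand in for the authors' models, and the fusion weight `ALPHA`, the generic `yolov8n.pt` checkpoint, and `detect_and_refine` are placeholders; the paper's fine-tuned eleven-class weights and actual weighting are not given.

```python
# Hypothetical sketch of the detect-then-classify pipeline from the abstract.
# ALPHA, the checkpoint name, and detect_and_refine are illustrative
# assumptions, not values or identifiers from the paper.
import cv2
import torch
from torchvision import models, transforms
from ultralytics import YOLO

ALPHA = 0.6  # assumed weight on the detector's confidence in the fused score

detector = YOLO("yolov8n.pt")  # YOLOv8 stage: object localization
classifier = models.resnet18(weights="DEFAULT")  # ResNet-18 refinement stage
classifier.fc = torch.nn.Linear(classifier.fc.in_features, 11)  # 11 classes
# In practice, weights fine-tuned on the eleven warehouse classes would be
# loaded here; the ImageNet backbone alone is only a placeholder.
classifier.eval()

to_input = transforms.Compose([
    transforms.ToTensor(),
    transforms.Resize((224, 224)),
])

def detect_and_refine(frame_bgr):
    """Detect with YOLOv8, re-classify each crop with ResNet-18, fuse scores."""
    results = detector(frame_bgr)[0]
    fused = []
    for box in results.boxes:
        x1, y1, x2, y2 = map(int, box.xyxy[0].tolist())
        crop = cv2.cvtColor(frame_bgr[y1:y2, x1:x2], cv2.COLOR_BGR2RGB)
        with torch.no_grad():
            probs = torch.softmax(classifier(to_input(crop).unsqueeze(0)), dim=1)
            cls_conf, cls_id = probs.max(dim=1)
        # Weighted scoring scheme: blend detector and classifier confidence.
        score = ALPHA * float(box.conf) + (1 - ALPHA) * float(cls_conf)
        fused.append(((x1, y1, x2, y2), int(cls_id), score))
    return fused
```

Weighting the detector's confidence more heavily (`ALPHA` > 0.5) reflects the abstract's framing of the CNN as a classification-refinement stage; the actual weighting used by the authors is not stated.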


This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.