Human Detection using Image Segmentation Technique on COCO Dataset

Abstract

This project compared two deep learning approaches to human detection on a custom COCO dataset. It measured how efficiently Detectron2 and Mask R-CNN recognise a human at a variety of distances, orientations and environments. The main objective was to develop a human detection system that works from the provided visual information. Before implementing the approaches in Python, the study reviewed image segmentation algorithms as well as face and hand detection. Instance segmentation was used in this research. The Person val2017 dataset was chosen because it is one of the most commonly used image segmentation datasets. Image segmentation is one of the most active areas of image processing research, and it must be completed before complex image processing tasks such as image identification and detection. Detectron2 was trained with the PyTorch framework and Mask R-CNN with TensorFlow. The project was divided into four parts, each designed and implemented to achieve the stated goal: human detection, image segmentation, measuring the efficiency of Mask R-CNN and Detectron2 for comparison, and measuring the proposed system’s overall efficiency. Finally, the segmentation-based evaluation results for Mask R-CNN and Detectron2 were compared. The average precision of Mask R-CNN was 36.1 percent, whereas that of Detectron2 was 55.5 percent. The average recall of Mask R-CNN was 61.2 percent and that of Detectron2 was 80.4 percent. The overall accuracy of the system was calculated as 85 percent, and it could be improved with additional training of the model. The results showed that Mask R-CNN performs well but requires more training time, whereas Detectron2 can be trained in a relatively short time.
The project draws on the following topics: computer vision, deep learning, feature extraction, human detection, the COCO dataset, Mask R-CNN, Detectron2 and instance segmentation.
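
The abstract reports segmentation-based precision and recall without stating the exact evaluation protocol. As a rough illustration only, the sketch below shows how precision and recall can be computed for predicted instance masks at a single intersection-over-union (IoU) threshold. The function names, the greedy matching rule, the 0.5 threshold and the toy masks are all assumptions made for this example; this is not the study's evaluation code, and it is simpler than the standard COCO metric, which averages precision over IoU thresholds from 0.50 to 0.95.

```python
from typing import List, Sequence, Tuple

Mask = Sequence[Sequence[int]]  # binary mask given as rows of 0/1 values


def mask_iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two binary masks of equal shape."""
    inter = 0
    union = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            inter += 1 if (pa and pb) else 0
            union += 1 if (pa or pb) else 0
    return inter / union if union else 0.0


def precision_recall(
    predictions: List[Tuple[float, Mask]],
    ground_truth: List[Mask],
    iou_threshold: float = 0.5,  # assumed threshold, not taken from the study
) -> Tuple[float, float]:
    """Greedily match predictions (highest score first) to unmatched ground-truth masks."""
    matched = set()
    true_positives = 0
    for _score, pred in sorted(predictions, key=lambda p: p[0], reverse=True):
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue
            iou = mask_iou(pred, gt)
            if iou > best_iou:
                best_iou, best_idx = iou, idx
        if best_idx >= 0 and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_positives += 1
    precision = true_positives / len(predictions) if predictions else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Toy data (hypothetical): two ground-truth person masks and two predictions.
gt_masks = [
    [[1, 1, 0, 0], [1, 1, 0, 0]],
    [[0, 0, 0, 1], [0, 0, 0, 1]],
]
pred_masks = [
    (0.9, [[1, 1, 0, 0], [1, 0, 0, 0]]),  # overlaps the first person well
    (0.6, [[0, 0, 1, 0], [0, 0, 1, 0]]),  # misses the second person entirely
]

precision, recall = precision_recall(pred_masks, gt_masks)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case one of the two predictions matches a ground-truth mask, so both precision and recall come out at 0.5. A lower precision than recall, as in the reported results, would correspond to a model that finds most people but also emits many unmatched masks.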

Copyright owned by the Saudi Digital Library (SDL) © 2025