Human Detection using Image Segmentation Technique on COCO Dataset
Abstract
This project compared two deep learning approaches, Detectron2 and Mask R-CNN, for human
detection on a custom COCO dataset. It sought to measure how efficiently each approach
recognises a human at a variety of distances, orientations and environments. The main
objective was to develop a human detection system using the provided visual information.
Before implementing the approaches in Python, the study reviewed image segmentation
algorithms as well as face and hand detection. Instance segmentation was used in this
research.
The Person val2017 dataset was chosen for the current study because it is one of the most
widely used image segmentation datasets. Image segmentation is one of the most active areas
of image processing research, and it is a prerequisite for complex image processing tasks
such as image identification and detection. The PyTorch framework was used to train
Detectron2, and TensorFlow was used to train Mask R-CNN.
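As an illustration of how a person-only subset can be derived from COCO-style annotations,
the following is a minimal sketch, not the procedure used in this project. It assumes the
standard COCO JSON layout with "images", "annotations" and "categories" lists; the function
name filter_person_subset and the toy data are hypothetical.

```python
def filter_person_subset(coco):
    """Return a COCO-style dict holding only 'person' annotations and their images."""
    # Look the category up by name instead of hard-coding its numeric id.
    person_ids = {c["id"] for c in coco["categories"] if c["name"] == "person"}
    annotations = [a for a in coco["annotations"] if a["category_id"] in person_ids]
    image_ids = {a["image_id"] for a in annotations}
    return {
        "images": [im for im in coco["images"] if im["id"] in image_ids],
        "annotations": annotations,
        "categories": [c for c in coco["categories"] if c["id"] in person_ids],
    }


# Tiny hand-made example in the COCO layout (not real dataset content).
toy = {
    "images": [{"id": 1, "file_name": "a.jpg"}, {"id": 2, "file_name": "b.jpg"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1},
        {"id": 11, "image_id": 2, "category_id": 18},
    ],
    "categories": [{"id": 1, "name": "person"}, {"id": 18, "name": "dog"}],
}
person_only = filter_person_subset(toy)
```

In practice the input dict would come from loading an annotation file with json.load; the
filtered result keeps the same layout, so it can be written back out and registered with
either framework.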
The project was divided into four parts, designed and implemented to achieve the stated
goal: human detection, image segmentation, measuring the efficiency of Mask R-CNN and
Detectron2 for comparison, and measuring the proposed system’s overall efficiency.
Finally, the segmentation-based evaluation results for Mask R-CNN and Detectron2 were
compared. The average precision of Mask R-CNN was 36.1 percent, whereas that of Detectron2
was 55.5 percent. The average recall of Mask R-CNN was 61.2 percent and that of Detectron2
was 80.4 percent. The overall accuracy of the system was calculated as 85 percent; however,
the accuracy could be improved with additional training of the model.
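To make the precision and recall figures concrete, the following is a simplified sketch of
how the two quantities can be computed for predicted segmentation masks. It is not the
evaluation code used in this project: it assumes a single IoU threshold of 0.5 and greedy
one-to-one matching, whereas COCO-style average precision is additionally averaged over
several IoU thresholds. Masks are represented as sets of pixel coordinates, and all
function names are hypothetical.

```python
def mask_iou(a, b):
    """Intersection over union of two masks given as sets of (row, col) pixels."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def precision_recall(predictions, ground_truths, iou_threshold=0.5):
    """Greedily match predicted masks to ground-truth masks, highest score first.

    predictions: list of (score, mask) pairs; ground_truths: list of masks.
    """
    matched = set()
    true_positives = 0
    for _, mask in sorted(predictions, key=lambda p: p[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truths):
            if idx in matched:
                continue  # each ground-truth mask may be matched only once
            iou = mask_iou(mask, gt)
            if iou > best_iou:
                best_iou, best_idx = iou, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_positives += 1
    precision = true_positives / len(predictions) if predictions else 0.0
    recall = true_positives / len(ground_truths) if ground_truths else 0.0
    return precision, recall


def rect(r0, c0, r1, c1):
    """Helper: rectangular mask covering rows r0..r1-1 and columns c0..c1-1."""
    return {(r, c) for r in range(r0, r1) for c in range(c0, c1)}


# Two ground-truth persons; one prediction overlaps the first, one is a false alarm.
gts = [rect(0, 0, 4, 4), rect(10, 10, 14, 14)]
preds = [(0.9, rect(0, 0, 4, 3)), (0.6, rect(20, 20, 24, 24))]
precision, recall = precision_recall(preds, gts)
```

Here one of two predictions is correct and one of two persons is found, so both precision
and recall are 0.5.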
The results showed that Mask R-CNN performs well but requires more training time, whereas
Detectron2 can be trained in a relatively short time.
Keywords: computer vision, deep learning, feature extraction, human detection, COCO dataset,
Mask R-CNN, Detectron2, instance segmentation.