Recognizing Arabic Typed Text-based CAPTCHAs Using Deep Learning Algorithm
Date
2019
Publisher
Saudi Digital Library
Abstract
CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is an
important security technology that aims to distinguish human users from automated programs. Arabic
CAPTCHA recognition refers to the identification of the characters in typed, text-based Arabic
CAPTCHAs. It has emerged as a new research area in recent years, with the aim of easing access to
Arabic websites. Feature extraction and accurate classification help in achieving higher
recognition accuracy. This project combines the fields of cybersecurity and artificial
intelligence. Its purpose is to compare the efficiency of several classification techniques in
extracting distinctive features of all forms of Arabic text-based CAPTCHA characters and
recognizing them. That is, the project combines the segmentation and recognition of Arabic
text-based CAPTCHAs. It is an extension of the previous paper "Evaluating Robustness of
Arabic CAPTCHAs". Since Arabic text-based CAPTCHA characters can be recognized using deep
learning techniques after segmentation, this project determines how effective these techniques
are at capturing useful information and, therefore, at achieving more accurate recognition results.
In the first phase, we applied a vertical segmentation method to Arabic text-based CAPTCHA samples
to evaluate the robustness of these samples. In the second phase, we applied a recognition process
to the images segmented during the first phase, using deep learning methods
(such as Convolutional Neural Networks (CNN) and Multi-Layer Feed-Forward ANN) alongside other
machine learning methods such as Artificial Neural Networks (ANN), k-Nearest Neighbor (kNN), and
Support Vector Machines (SVM), to identify the characters in Arabic CAPTCHAs. Across the two
datasets, we obtained the best accuracy with the CNN method on dataset #1 and with the ANN method
on dataset #2.
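
The abstract does not specify how the vertical segmentation was implemented. As an illustration only, the sketch below shows one common form of the idea, a vertical projection profile: count the ink pixels in each column of a binarized image and cut wherever a column is empty. The function names, the 0/1 list-of-rows image format, and the `min_width` parameter are assumptions made for this example, not details taken from the project. Because Arabic script is cursive, connected letters will not be separated by this simple rule.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of 0 (background) and 1 (ink); assumed format


def vertical_segments(image: Image, min_width: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, that contain ink."""
    if not image:
        return []
    width = len(image[0])
    # Vertical projection profile: number of ink pixels in each column.
    profile = [sum(row[x] for row in image) for x in range(width)]

    segments = []
    start = None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x  # entering a run of columns that contain ink
        elif count == 0 and start is not None:
            if x - start >= min_width:
                segments.append((start, x))  # run ended at an empty column
            start = None
    # Close a run that reaches the right edge of the image.
    if start is not None and width - start >= min_width:
        segments.append((start, width))
    return segments


def crop(image: Image, segment: Tuple[int, int]) -> Image:
    """Cut one column range out of the image."""
    start, end = segment
    return [row[start:end] for row in image]


# Toy 3x7 image with two ink blobs separated by empty columns.
toy = [
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1, 1],
]
segments = vertical_segments(toy)
pieces = [crop(toy, s) for s in segments]
```

The segments come back in left-to-right column order; for Arabic text, which is read right to left, the list would be reversed before the cropped pieces are passed to a classifier.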