Evaluating Machine Reading Comprehension for the Dative Alternation Phenomenon
Abstract
The ability of a computer to read and interpret natural language texts in order to answer a query
is still a work in progress. Recently, machine reading comprehension has grown in prominence as
deep learning has gained popularity and large-scale datasets have become available. The goal of
this paper is to evaluate machine reading comprehension in natural language processing by presenting
two BERT-based models, RoBERTa and MiniLM, and capturing their performance when they
encounter the linguistic phenomenon of dative alternation. This dissertation explores the detection
of a linguistic phenomenon using machine learning techniques together with pre-trained deep
learning language models. Finally, the analysis of the models' scores is based on the complexity
of the passages. The datasets are divided into two categories according to their linguistic
dimension: syntactic complexity and lexical diversity. The lexical dimension comprises six
datasets: three simple and three complex. The syntactic complexity dimension comprises three
datasets: a simple, a medium, and a complex dataset. The pre-trained BERT models were fine-tuned
on a total of 39 datasets.
The results of the experiments show that the MiniLM model outperforms the RoBERTa model,
achieving an accuracy of 0.706 and an F1 score of 0.943 on a sample of 54,468 dative passages
drawn from 14 different verb sets. Further, we also point out flaws in existing machine reading
comprehension datasets as well as possibilities for future research.