When a Few Words Are Not Enough: Improving Text Classification Through Contextual Information
Abstract
Traditional text classification approaches may be ineffective when applied to texts that contain only a few words, because such brevity leaves the feature space sparse.
The lack of contextual information can make texts ambiguous; hence, text classification approaches that rely solely on words may not properly capture the critical features of a real-world problem.
One popular approach to overcoming this problem is to enrich texts with additional domain-specific features.
This thesis shows how this can be done for two real-world problems in which text information alone is insufficient for classification:
depression detection based on the automatic analysis of clinical interviews, and the detection of fake online news.
