An Empirical Study of Programming Languages Impact on Open-Source Software Projects based on Mining Software Repositories
Abstract
Dozens of programming languages are in use today, and new languages and language features are introduced frequently. However, only a few empirical studies examine how programming languages are used in practice. In this research, we explored languages from an empirical, pragmatic perspective to examine their association with open-source software (OSS) projects and practices. The research was conducted in a comparative setting to investigate whether a significant association exists. That is, languages were compared both individually and in groups to understand similarities and examine differences, if any, in popularity and user adoption, feature usage, and OSS project attributes. The methodology was based on mining software repositories, and the results were obtained from an analysis of possibly the largest open-source dataset (a sample of 5,350 projects from a total of 15,000 projects) in which a main language was identified. The investigation revealed that a considerable association exists; however, the effect size of this association was modest. When accounting for confounding factors such as project size and type, the findings held in only a small number of the tested cases. Thus, the choice of language has a limited effect on OSS development.