University-industry collaboration and open source software (OSS) dataset in mining software repositories (MSR) research

Mining Software Repositories (MSR) is an applied, practice-oriented field aimed at solving real problems encountered by practitioners and bringing value to industry. We believe that empirical studies on both Open Source Software (OSS) and Closed or Proprietary Source Software (CSS/PSS) are required in MSR research to increase the generalizability or transferability of findings and to reduce threats to external validity. Furthermore, we believe that collaboration between universities and industry is essential to achieving the stated goals and agenda of MSR research (such as deployment and technology transfer). We analyse five years of research papers published in the MSR series of conferences (2010-2014) and count the number of studies using solely OSS data, solely CSS data, or both OSS and CSS data. We also count the number of papers published by authors solely from universities, solely from industry, and from both universities and industry. Our findings indicate a lack of university-industry collaboration (measured using co-authorship in scientific publications) and a paucity of empirical studies on CSS/PSS data. Of 187 studies over the five-year period, 90.9% were conducted solely on OSS datasets, and only 14.43% involved a university-industry collaboration.
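
The counting procedure described above can be illustrated with a minimal sketch. It assumes each paper has been manually annotated with the kinds of data it uses and the affiliation type of each author. The record layout, the `category` and `tally` helpers, and the four toy papers are hypothetical and are not taken from the study's data.

```python
from collections import Counter


def category(labels):
    """Return 'solely X' if all labels agree, 'both' if two kinds appear."""
    distinct = set(labels)
    if not distinct:
        raise ValueError("a paper needs at least one label")
    if len(distinct) == 1:
        return "solely " + next(iter(distinct))
    return "both"


def tally(papers, key):
    """Map each category to (count, percentage of all papers)."""
    counts = Counter(category(p[key]) for p in papers)
    total = len(papers)
    return {name: (n, round(100.0 * n / total, 2)) for name, n in counts.items()}


# Hypothetical annotations, for illustration only (not the MSR 2010-2014 data).
toy_papers = [
    {"data": ["OSS"], "authors": ["University", "University"]},
    {"data": ["OSS"], "authors": ["University", "Industry"]},
    {"data": ["CSS"], "authors": ["Industry"]},
    {"data": ["OSS", "CSS"], "authors": ["University"]},
]

data_share = tally(toy_papers, "data")       # OSS-only, CSS-only, or both
author_share = tally(toy_papers, "authors")  # co-authorship as collaboration proxy
```

Here a paper counts as a university-industry collaboration when its author list contains both affiliation types, mirroring the co-authorship measure mentioned in the abstract.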