A New Framework for Classifying University-Industry Collaboration Using Synthetic Minority Oversampling Technique and Stacking Ensemble
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | |
| Online Access: | https://ieeexplore.ieee.org/document/11028081/ |
| Summary: | University-industry collaboration has emerged as a critical driver of innovation and economic growth. However, predicting the outcomes of these collaborations remains methodologically challenging. Conventional statistical methods fail to capture non-linear relationships in collaboration data, and single machine learning models, although increasingly adopted, are often limited by class imbalance. This paper proposes a framework that integrates a stacking ensemble with the synthetic minority over-sampling technique (SMOTE) to classify university-industry collaboration into three performance classes (low, medium and high). The approach leverages the collective merits of multiple single models while mitigating class imbalance, to improve classification results. The framework combines the outputs of random forest, J48 and support vector machine base models, with a multilayer perceptron making the final classification. Using the global competitiveness index dataset, we analyze four key predictive dimensions: higher learning and research institutes, government involvement, resources and absorptive capacity. Experimental results show that the stacking ensemble-SMOTE300% model achieves the highest accuracy (92.6%), surpassing the other ensemble configurations (stacking ensemble-SMOTE200%: 90.3%, stacking ensemble-SMOTE100%: 88.2%) and the single models (random forest: 87.7%, neural networks: 84.6%, J48: 84.2%, and support vector machines: 83.8%). Notably, stacking ensemble-SMOTE300% reduces misclassification of the minority class (high-performing collaborations) to 1.9%, a substantial improvement over random forest (13.9%), which indicates that it effectively mitigates class imbalance. This improvement comes with additional computational cost, a trade-off for the predictive benefits. Beyond methodological advancements, our framework identifies university-industry collaboration success factors, offering stakeholders a data-driven tool to inform resource allocation and targeted mitigation strategies. These advancements offer actionable insights to enhance collaboration success across various national contexts. |
|---|---|
| ISSN: | 2169-3536 |
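
The summary names SMOTE at 100%, 200% and 300% as the oversampling step. The following is a minimal sketch of the core SMOTE idea only, not the authors' implementation. It assumes that "SMOTE300%" means three synthetic points are generated per original minority sample, which is how the percentage is commonly read but is not confirmed by the record. The function name `smote`, its parameters, and the toy data are hypothetical.

```python
import random
from math import dist


def smote(minority, percentage=300, k=5, seed=0):
    """Return synthetic minority samples built by SMOTE-style interpolation.

    Assumption: percentage=300 means 3 synthetic points per original
    minority point (200 -> 2, 100 -> 1).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    per_sample = percentage // 100
    k = min(k, len(minority) - 1)
    synthetic = []
    for i, x in enumerate(minority):
        # k nearest minority neighbours of x, excluding x itself
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: dist(x, p))[:k]
        for _ in range(per_sample):
            nb = rng.choice(neighbours)
            gap = rng.random()  # position on the segment from x towards nb
            synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic


# Toy minority class with two features (hypothetical values)
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, percentage=300, k=3)
print(len(new_points))  # 12 synthetic samples: 3 per original point
```

In a pipeline like the one the summary describes, the synthetic points would be appended to the training data only, before the base models and the meta-learner are fitted, so that the test set keeps its original class distribution.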