A New Framework for Classifying University-Industry Collaboration Using Synthetic Minority Oversampling Technique and Stacking Ensemble

Bibliographic Details
Main Authors: Uzapi Hange, Ezenwa Chike Nwanesi, Monhesea Obrey Patrick Bah
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/11028081/
Description
Summary: University-industry collaboration has emerged as a critical driver of innovation and economic growth. However, predicting the outcomes of these collaborations remains methodologically challenging. Conventional statistical methods fail to capture non-linear relationships in collaboration data. While single machine learning models are increasingly adopted, they are often limited by class imbalance. This paper proposes a framework that integrates a stacking ensemble with the synthetic minority over-sampling technique (SMOTE) to classify university-industry collaboration into three performance classes (low, medium and high). This approach leverages the collective merits of multiple single models, while mitigating class imbalance, to deliver optimal classification results. Our framework combines the outputs of random forest, J48 and support vector machine base models, with a multilayer perceptron making the final classifications. Using the global competitiveness index dataset, we analyze four key predictive dimensions: higher learning and research institutes, government involvement, resources and absorptive capacity. Experimental results reveal that the stacking ensemble-SMOTE300% model achieves superior accuracy (92.6%), surpassing other ensemble configurations (stacking ensemble-SMOTE200%: 90.3%, stacking ensemble-SMOTE100%: 88.2%) and single models (random forest: 87.7%, neural networks: 84.6%, J48: 84.2%, and support vector machines: 83.8%). Notably, stacking ensemble-SMOTE300% reduces misclassifications of the minority class (high-performing collaborations) to 1.9%, a substantial improvement over random forest (13.9%), thereby effectively resolving class imbalance. This enhancement comes with additional computational costs, a necessary trade-off for predictive benefits.
Beyond methodological advancements, our framework identifies university-industry collaboration success factors, offering stakeholders a data-driven tool to inform resource allocation and targeted mitigation strategies. These advancements offer actionable insights to enhance collaboration success across various national contexts.
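The oversampling step named in the summary can be illustrated with a minimal, standard-library-only sketch of SMOTE-style interpolation. It is not the authors' implementation. It assumes that a label such as "SMOTE300%" means three synthetic points are generated per original minority instance. The function name `smote`, its parameters (`percentage`, `k`, `seed`) and the toy data are hypothetical and chosen only for illustration. The stacking stage (base models feeding a multilayer perceptron) is not covered here.

```python
import math
import random


def smote(minority, percentage=300, k=5, seed=0):
    """Return synthetic minority samples (the originals are not included).

    minority   -- list of equal-length numeric feature vectors
    percentage -- amount of oversampling; 300 is read here as "three synthetic
                  points per original minority point" (an assumption)
    k          -- number of nearest minority neighbours to interpolate towards
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    per_sample = percentage // 100
    k = min(k, len(minority) - 1)
    synthetic = []
    for i, x in enumerate(minority):
        # k nearest minority neighbours of x, excluding x itself
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(x, p))[:k]
        for _ in range(per_sample):
            nb = rng.choice(neighbours)
            gap = rng.random()  # position along the segment from x to nb
            synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic


# Toy data only (hypothetical, not taken from the paper's dataset).
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote(minority, percentage=300, k=3)
print(len(new_points))  # 4 originals x 3 = 12 synthetic points
```

Each synthetic point lies on a line segment between a minority sample and one of its nearest minority neighbours, so the new points stay inside the region already occupied by the minority class. In a full pipeline, these points would be added to the training data only, before the base models and the meta-learner are fitted.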
ISSN: 2169-3536