COSMIC-Based Early Software Size Estimation Using Deep Learning and Domain-Specific BERT


Saved in:
Bibliographic Details
Main Authors: Yohannes Sefane Molla, Esubalew Alemneh, Samuel Temesgen Yimer
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/10879224/
Description
Summary: Although the Common Software Measurement International Consortium (COSMIC) method plays a substantial role in accurate and early size measurement, it requires structured processes and strictly specified requirements. Consequently, applying COSMIC to software size measurement is challenging and calls for automation to make measurement effective and efficient. In this study, we propose an automated COSMIC-based early software size estimation approach using a domain-specific BERT, called RE-BERT, together with deep learning algorithms. The research comprised two major experiments: the first further pretrained the generic BERT model on requirements engineering texts to produce the domain-specific pretrained model RE-BERT, while the second developed a deep learning regressor for COSMIC size estimation using RE-BERT. Multi-Layer Perceptron (MLP) and BERT regressors were used to train the size estimation models. The experimental results show that RE-BERT MLP achieves an MAE of 0.691 and an MSE of 0.988, outperforming the other regression models (BASE-BERT MLP, RE-BERT regressor, and BASE-BERT regressor). On average, RE-BERT-based regressor models improved on BASE-BERT regressor models by 1.23% to 3.19%. This indicates that domain-specific pretrained models benefit the performance of deep learning models, and that BERT performs well in regression tasks just as it does in classification.
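The pipeline summarized above (sentence embeddings from a pretrained BERT fed into an MLP regressor that predicts COSMIC sizes) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the RE-BERT checkpoint is not public, so the embedding step is stubbed with random vectors, and all dimensions, layer sizes, and labels below are assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, mean_squared_error

rng = np.random.default_rng(0)

# Stand-in for RE-BERT: in the paper, each requirement text would be
# encoded into a fixed-size embedding (e.g. the 768-dim [CLS] vector)
# by the domain-pretrained model. Here we use random placeholders.
def encode_requirements(n_texts: int, dim: int = 768) -> np.ndarray:
    return rng.normal(size=(n_texts, dim))

# Synthetic dataset: embeddings -> COSMIC Function Point (CFP) sizes.
# The labels are illustrative, not real measurements.
X = encode_requirements(200)
y = rng.uniform(1.0, 10.0, size=200)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# MLP regression head, analogous in spirit to the "RE-BERT MLP" setup;
# the hidden-layer sizes are arbitrary choices for this sketch.
mlp = MLPRegressor(hidden_layer_sizes=(128, 64), max_iter=500,
                   random_state=0)
mlp.fit(X_train, y_train)

pred = mlp.predict(X_test)
print("MAE:", mean_absolute_error(y_test, pred))
print("MSE:", mean_squared_error(y_test, pred))
```

With real RE-BERT embeddings, `encode_requirements` would instead run the pretrained encoder over each requirement sentence; the MAE/MSE printed here are meaningless for the synthetic labels and only show the evaluation shape of the experiment.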
ISSN:2169-3536