Analyzing Key Features of Open Source Software Survivability with Random Forest

Open source software (OSS) projects rely on voluntary contributions, but their long-term survivability depends on sustained community engagement and effective problem-solving. Survivability, critical for maintaining project quality and trustworthiness, is closely linked to issue activity, as unresol...

Bibliographic Details
Main Authors: Sohee Park, Gihwon Kwon
Format: Article
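Language: English
Published: MDPI AG 2025-01-01
Series: Applied Sciences
Subjects: Kaplan–Meier survival function; open source software; random forest; software maintenance; survivability
Online Access: https://www.mdpi.com/2076-3417/15/2/946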
author Sohee Park
Gihwon Kwon
collection DOAJ
description Open source software (OSS) projects rely on voluntary contributions, but their long-term survivability depends on sustained community engagement and effective problem-solving. Survivability, critical for maintaining project quality and trustworthiness, is closely linked to issue activity, as unresolved issues reflect a decline in maintenance capacity and problem-solving ability. Thus, analyzing issue retention rates provides valuable insights into a project’s health. This study evaluates OSS survivability by identifying the features that influence issue activity and analyzing their relationships with survivability. Kaplan–Meier survival analysis is employed to quantify issue activity and visualize trends in unresolved issue rates, providing a measure of project maintenance dynamics. A random forest model is used to examine the relationships between project features (such as popularity metrics, community engagement, code complexity, and project age) and issue retention rates. The results show that stars significantly reduce issue retention rates, with rates dropping from 0.62 to 0.52 as stars increase to 4000, while larger codebases, higher cyclomatic complexity, and older project age are associated with higher unresolved issue rates, which rise by up to 15%. Forks also have a nonlinear impact, initially stabilizing retention rates but increasing unresolved issues as contributions become unmanageable. By identifying these critical factors and quantifying their impacts, this research offers actionable insights for OSS project managers to enhance project survivability and address key maintenance challenges, ensuring sustainable long-term success.
format Article
id doaj-art-c724759d628641a99f5981ebc41d222e
institution Kabale University
issn 2076-3417
language English
publishDate 2025-01-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj-art-c724759d628641a99f5981ebc41d222e; 2025-01-24T13:21:26Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2025-01-01; volume 15; issue 2; article 946; 10.3390/app15020946
Sohee Park (0); Gihwon Kwon (1)
(0) Department of SW Safety and Cyber Security, Kyonggi University, Suwon-si 16227, Gyeonggi-do, Republic of Korea
(1) Department of SW Safety and Cyber Security, Kyonggi University, Suwon-si 16227, Gyeonggi-do, Republic of Korea
title Analyzing Key Features of Open Source Software Survivability with Random Forest
topic Kaplan–Meier survival function
open source software
random forest
software maintenance
survivability
url https://www.mdpi.com/2076-3417/15/2/946