Can the number of confirmed COVID-19 cases be predicted more accurately by including lifestyle data? An exploratory study for data-driven prediction of COVID-19 cases in metropolitan cities using deep learning models

Bibliographic Details
Main Author: Sungwook Jung
Format: Article
Language: English
Published: SAGE Publishing 2025-01-01
Series: Digital Health
Online Access: https://doi.org/10.1177/20552076251314528
Description
Summary: Objective: The COVID-19 outbreak has significantly affected people's lifestyles and daily patterns. Data on social life may therefore indicate whether the number of confirmed COVID-19 cases will rise or fall. However, although the number of confirmed cases is affected by social life, studies that attempt to predict it using a variety of lifestyle data are hard to find. This paper presents an exploratory data analysis of whether the number of confirmed cases can be predicted more accurately by including various lifestyle data. Methods: The lifestyle data covered public transportation use, cinema attendance, and motel stays. A ‘lifestyle addition’ set was then constructed by adding the lifestyle data to past confirmed case counts and search term frequency data. The deep learning algorithms used in the analysis were deep neural networks (DNNs) and recurrent neural networks (RNNs). Performance differences across data sets and between deep learning models were tested and found to be statistically significant. Results: Among metropolitan cities in South Korea, Seoul (population 9.6 million), the most populous, and Busan (3.4 million), the second most populous, had the lowest error rates with the ‘lifestyle addition’ set. With this set, the error rate in Seoul was reduced to 20.1%, and in Busan the predicted curve of confirmed cases was almost identical to the actual one. Conclusions: This study identified three notable results that could contribute to predicting the number of infected patients in future epidemics.
ISSN: 2055-2076