Evaluation of linear, nonlinear and ensemble machine learning models for landslide susceptibility assessment in southwest China

Machine learning models are gradually replacing traditional techniques used for landslide susceptibility assessment. This study aims to comprehensively compare multiple models, including linear, nonlinear, and ensemble models, based on 5281 historical landslides in southwest China, the area most sev...

Bibliographic Details
Main Authors:	Bingwei Wang, Qigen Lin, Tong Jiang, Huaxiang Yin, Jian Zhou, Jinhao Sun, Dongfang Wang, Ran Dai
Format:	Article
Language:	English
Published:	Taylor & Francis Group 2023-12-01
Series:	Geocarto International
Subjects:	Evaluation of machine learning models; cross-validation; landslide susceptibility assessment; southwest China
Online Access:	https://www.tandfonline.com/doi/10.1080/10106049.2022.2152493

collection	DOAJ
description	Machine learning models are gradually replacing traditional techniques for landslide susceptibility assessment. This study comprehensively compares linear, nonlinear, and ensemble models based on 5281 historical landslides in southwest China, the area most severely affected by landslide disasters. Logistic regression (LR) was selected to represent linear models; support vector machine (SVM), artificial neural network (ANN) and classification 5.0 decision tree (C5.0 DT) to represent nonlinear models; and random forest (RF) and categorical boosting (Catboost) to represent ensemble models. The correlation coefficient, variance inflation factor (VIF), and relative importance analysis were used to select the dominant landslide conditioning factors. Model performance was evaluated using multiple statistical indicators (e.g. Area Under the Receiver Operating Characteristic curve (AUC) and Kappa), cross-validation and qualitative methods. The findings are: (1) Regarding predictive performance, the ensemble models Catboost (AUC = 0.823 and Kappa = 0.593) and RF (AUC = 0.821 and Kappa = 0.582) performed best, followed by the nonlinear models SVM (AUC = 0.775 and Kappa = 0.520), ANN (AUC = 0.770 and Kappa = 0.486) and C5.0 DT (AUC = 0.751 and Kappa = 0.497), while the linear model LR (AUC = 0.756 and Kappa = 0.456) had more limited performance. Ensemble models that use a tree as the base classifier show considerable potential for landslide susceptibility studies. (2) Regarding robustness, the three types of models performed relatively similarly in nonspatial cross-validation (CV), while in spatial cross-validation (SPCV) the linear model LR (median AUC = 0.714) achieved better results than the ensemble and nonlinear models. This implies that when the distribution of landslides is not homogeneous, linear models may be the most robust. It is advisable to consider various evaluation metrics from different perspectives and integrate them with expert qualitative geomorphological knowledge to determine the best model. (3) The Gini index-based RF model suggests that road density was the dominant factor in the frequency of landslides in the study area.
id	doaj-art-06052c24e50546edb2ce66353cd1ce03
institution	Kabale University
issn	1010-6049 1752-0762
spelling	doaj-art-06052c24e50546edb2ce66353cd1ce03; 2025-02-05T08:30:30Z; eng; Taylor & Francis Group; Geocarto International; 1010-6049; 1752-0762; 2023-12-01; 38; 1; 10.1080/10106049.2022.2152493; Evaluation of linear, nonlinear and ensemble machine learning models for landslide susceptibility assessment in southwest China; Bingwei Wang, Qigen Lin, Tong Jiang, Huaxiang Yin, Jian Zhou, Jinhao Sun, Dongfang Wang, Ran Dai (all: Collaborative Innovation Center on Forecast and Evaluation of Meteorological Disasters, Institute for Disaster Risk Management/School of Geographical Sciences, Nanjing University of Information Science & Technology, Nanjing, China); https://www.tandfonline.com/doi/10.1080/10106049.2022.2152493; Evaluation of machine learning models; cross-validation; landslide susceptibility assessment; southwest China
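
The abstract in this record reports AUC and Kappa for each model. As a rough illustration of how these two indicators can be computed for a binary landslide/non-landslide classifier, here is a minimal Python sketch. The labels, scores, 0.5 threshold and function names are made up for illustration and are not taken from the study, whose own implementation is not described in this record.

```python
def cohen_kappa(y_true, y_pred):
    """Cohen's kappa for two sequences of binary labels (0 or 1)."""
    n = len(y_true)
    # Observed agreement: share of samples where prediction matches truth.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Agreement expected by chance, from the marginal class frequencies.
    p_true1 = sum(y_true) / n
    p_pred1 = sum(y_pred) / n
    expected = p_true1 * p_pred1 + (1 - p_true1) * (1 - p_pred1)
    return (observed - expected) / (1 - expected)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive sample scores higher
    than a random negative sample (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = landslide cell, 0 = non-landslide cell.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]  # assumed susceptibility scores
predicted = [int(s >= 0.5) for s in scores]         # assumed 0.5 cut-off

print(cohen_kappa(labels, predicted))  # 0.5
print(roc_auc(labels, scores))         # 0.8125
```

Kappa depends on the chosen cut-off while AUC does not, which is one reason the two indicators can rank models differently.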


Similar Items