Prediction of Total Organic Carbon Content in Shale Based on PCA-PSO-XGBoost

Bibliographic Details
Main Authors: Yingjie Meng, Chengwu Xu, Tingting Li, Tianyong Liu, Lu Tang, Jinyou Zhang
Format: Article
Language: English
Published: MDPI AG 2025-03-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/15/7/3447
Description
Summary: Total organic carbon (TOC) content is an important parameter for evaluating the abundance of organic matter in, and the hydrocarbon production capacity of, shale. Currently, no prediction method is applicable to all geological conditions, so exploring an efficient and accurate prediction method suitable for the study area is of great significance. In this study, for the shale of the Qingshankou Formation of the Gulong Sag in the Songliao Basin, TOC content prediction models using various machine learning algorithms are established and compared based on measured data, principal component analysis (PCA), and the particle swarm optimization (PSO) algorithm. Pearson correlation analysis showed that GR, AC, DEN, CNL, LLS, and LLD are the parameters most sensitive to TOC content. Four principal components were then identified as input features through PCA processing. The XGBoost prediction model, established after selecting its parameters through PSO, had the highest accuracy, with an R² and RMSE of 0.90 and 0.1545, respectively, both superior to the values of the other models. This model is suitable for the prediction of TOC content and provides effective technical support for shale oil exploration and development in the study area.
ISSN: 2076-3417
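
The summary names three computational building blocks: Pearson correlation screening, PSO-based parameter selection, and evaluation by R² and RMSE. This record does not give the authors' implementation, so the following is only a minimal standard-library sketch of those pieces. The function names, the PSO settings (`w`, `c1`, `c2`, swarm size), the search bounds, and `stand_in_objective` are all assumptions made for illustration. In particular, the objective is a hypothetical smooth surface over two made-up parameters; in a real workflow it would presumably be replaced by a validation error of the model being tuned.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def pso(objective, bounds, n_particles=20, n_iter=60,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # personal best positions
    pbest_val = [objective(p) for p in pos]      # personal best scores
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # swarm-wide best

    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                ra, rb = rng.random(), rng.random()
                # inertia + pull toward personal best + pull toward global best
                vel[i][d] = (w * vel[i][d]
                             + c1 * ra * (pbest[i][d] - pos[i][d])
                             + c2 * rb * (gbest[d] - pos[i][d]))
                # move, then clamp the particle back into the search box
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def stand_in_objective(params):
    """Hypothetical error surface with its minimum at (0.1, 6)."""
    rate, depth = params
    return (rate - 0.1) ** 2 + (depth - 6.0) ** 2 / 100.0


# Hypothetical search box for two made-up parameters.
best, best_val = pso(stand_in_objective, bounds=[(0.01, 0.3), (2.0, 10.0)])
print("best parameters:", best, "objective:", best_val)
```

The helpers `pearson`, `rmse`, and `r2` follow the textbook definitions, so they can be applied directly to paired lists of measured and predicted TOC values. Integer-valued parameters such as a tree depth would need rounding inside the objective, which this sketch leaves out.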