PHYSICS-DRIVEN FEATURE CREATION TO IMPROVE MACHINE LEARNING MODELS PERFORMANCE FOR OIL PRODUCTION RATE PREDICTION

This paper aims to develop a machine learning-based model for oil production rate prediction. The significance of feature dimension reduction is addressed by applying well-established approaches like Principal Component Analysis (PCA) and the proposed physics-driven feature creation technique. The p...

Bibliographic Details
Main Authors: Eghbal Motaei, Seyed Mehdi Tabatabai, Tarek Ganat, Ahmad Khanifar, Sulaiman Dzaiy, Timur Chis
Format: Article
Language: English
Published: Petroleum-Gas University of Ploiesti 2024-12-01
Series: Romanian Journal of Petroleum & Gas Technology
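Subjects: oil rate prediction; feature engineering; principal component analysis; artificial intelligence; machine learning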
Online Access: http://jpgt.upg-ploiesti.ro/wp-content/uploads/2024/12/22_RJPGT_no.2-2024-Physics-driven-feature-ML-models-performance-oil-production-prediction.pdf
author Eghbal Motaei
Seyed Mehdi Tabatabai
Tarek Ganat
Ahmad Khanifar
Sulaiman Dzaiy
Timur Chis
collection DOAJ
description This paper develops a machine learning-based model for oil production rate prediction. It addresses feature dimension reduction by comparing a well-established approach, Principal Component Analysis (PCA), with the proposed physics-driven feature creation technique. The physics-driven features, derived from experience or analytical modeling, introduce physical relevance and improve model quality. The study uses a dataset that includes reservoir permeability, wellbore skin, reservoir pressure, net pay thickness, water cut, and well liquid production rate. Several machine learning models (SVM, k-NN, Decision Tree, Random Forest, and linear regression) were first built on PCA-reduced features, then tuned and validated using k-fold cross-validation. The same models were then built on physics-driven features, and their performance metrics were compared. The results show a significant improvement for the proposed physics-driven feature creation over PCA. Under 10-fold cross-validation, PCA raised R² by 7 percentage points (from 70% to 77%, a relative gain of 10%), while physics-driven features raised it by 20 percentage points (from 70% to 90% on average). The Random Forest and linear regression models outperformed the others, particularly when built on physics-driven features. In addition, models built on physics-driven features were less sensitive to how the data were split into training and test sets, and were more reliable, with better performance metrics, than models built on the original features.
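The abstract does not say which physics-driven features the authors built, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes one plausible grouping taken from the shape of Darcy's radial inflow equation, k·h·p / (ln(re/rw) + S), plus the oil fraction 1 − water cut, and scores a one-feature least-squares fit with k-fold cross-validated R². The function names, the argument units, and the placeholder value `ln_re_rw=7.0` are all hypothetical.

```python
import random


def physics_features(k_md, h_ft, p_res_psi, skin, water_cut, ln_re_rw=7.0):
    """Return (inflow_group, oil_fraction) for one well record.

    inflow_group mimics the Darcy radial-inflow grouping
    k*h*p / (ln(re/rw) + S). ln_re_rw = 7.0 is an assumed placeholder,
    not a value taken from the paper.
    """
    denom = ln_re_rw + skin
    if denom <= 0:
        raise ValueError("ln(re/rw) + skin must be positive")
    if not 0.0 <= water_cut <= 1.0:
        raise ValueError("water_cut must be a fraction between 0 and 1")
    inflow_group = k_md * h_ft * p_res_psi / denom
    oil_fraction = 1.0 - water_cut
    return inflow_group, oil_fraction


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """Ordinary least squares for a single feature; returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def kfold_r2(x, y, k=10, seed=0):
    """Mean R² over k shuffled folds, refitting the line on each training part."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        slope, intercept = fit_line([x[i] for i in train_idx],
                                    [y[i] for i in train_idx])
        preds = [slope * x[i] + intercept for i in test_idx]
        scores.append(r2_score([y[i] for i in test_idx], preds))
    return sum(scores) / len(scores)
```

In this sketch five raw inputs collapse into two features, which is the same dimension-reduction goal that PCA serves, except that each resulting feature keeps a physical interpretation. Comparing `kfold_r2` across different seeds would give a rough view of the sensitivity to data splits that the abstract mentions.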
format Article
id doaj-art-33ceada318184734b0d8d418a31e48a0
institution Kabale University
issn 2734-5319
2972-0370
language English
publishDate 2024-12-01
publisher Petroleum-Gas University of Ploiesti
record_format Article
series Romanian Journal of Petroleum & Gas Technology
spelling doaj-art-33ceada318184734b0d8d418a31e48a0
Record updated: 2025-02-01T20:44:04Z
Language: eng
Publisher: Petroleum-Gas University of Ploiesti
Journal: Romanian Journal of Petroleum & Gas Technology
ISSN: 2734-5319, 2972-0370
Published: 2024-12-01
Volume 5, Issue 2, pp. 291-306
DOI: 10.51865/JPGT.2024.02.22
Title: PHYSICS-DRIVEN FEATURE CREATION TO IMPROVE MACHINE LEARNING MODELS PERFORMANCE FOR OIL PRODUCTION RATE PREDICTION
Authors:
Eghbal Motaei (https://orcid.org/0000-0002-8191-2604), Petroleum Engineering Department, Petronas Carigali SDN BHD, Malaysia
Seyed Mehdi Tabatabai (https://orcid.org/0000-0001-5948-4343), Petroleum Engineering Department, Petronas Carigali SDN BHD, Malaysia
Tarek Ganat (https://orcid.org/0009-0000-7963-0264), Sultan Qaboos University, Oman
Ahmad Khanifar, Petroleum Engineering Department, Petronas Carigali SDN BHD, Malaysia
Sulaiman Dzaiy, Petroleum-Gas University of Ploiesti, Romania
Timur Chis (https://orcid.org/0000-0003-1751-828X), Petroleum-Gas University of Ploiesti, Romania
Full text: http://jpgt.upg-ploiesti.ro/wp-content/uploads/2024/12/22_RJPGT_no.2-2024-Physics-driven-feature-ML-models-performance-oil-production-prediction.pdf
Keywords: oil rate prediction; feature engineering; principal component analysis; artificial intelligence; machine learning
title PHYSICS-DRIVEN FEATURE CREATION TO IMPROVE MACHINE LEARNING MODELS PERFORMANCE FOR OIL PRODUCTION RATE PREDICTION
topic oil rate prediction
feature engineering
principal component analysis
artificial intelligence
machine learning
url http://jpgt.upg-ploiesti.ro/wp-content/uploads/2024/12/22_RJPGT_no.2-2024-Physics-driven-feature-ML-models-performance-oil-production-prediction.pdf