PHYSICS-DRIVEN FEATURE CREATION TO IMPROVE MACHINE LEARNING MODELS PERFORMANCE FOR OIL PRODUCTION RATE PREDICTION
This paper aims to develop a machine learning-based model for oil production rate prediction. The significance of feature dimension reduction is addressed by applying well-established approaches like Principal Component Analysis (PCA) and the proposed physics-driven feature creation technique. The p...
Main Authors: | Eghbal Motaei, Seyed Mehdi Tabatabai, Tarek Ganat, Ahmad Khanifar, Sulaiman Dzaiy, Timur Chis |
---|---|
Format: | Article |
Language: | English |
Published: | Petroleum-Gas University of Ploiesti, 2024-12-01 |
Series: | Romanian Journal of Petroleum & Gas Technology |
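Subjects: | oil rate prediction; feature engineering; principal component analysis; artificial intelligence; machine learning |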
Online Access: | http://jpgt.upg-ploiesti.ro/wp-content/uploads/2024/12/22_RJPGT_no.2-2024-Physics-driven-feature-ML-models-performance-oil-production-prediction.pdf |
_version_ | 1832574284161089536 |
---|---|
author | Eghbal Motaei, Seyed Mehdi Tabatabai, Tarek Ganat, Ahmad Khanifar, Sulaiman Dzaiy, Timur Chis |
author_sort | Eghbal Motaei |
collection | DOAJ |
description | This paper aims to develop a machine learning-based model for oil production rate prediction. The significance of feature dimension reduction is addressed by applying well-established approaches like Principal Component Analysis (PCA) and the proposed physics-driven feature creation technique. The physics-driven features, derived from experience or analytical modeling, introduce physical relevance and improve model quality. The study focuses on oil production prediction using a dataset that includes reservoir permeability, wellbore skin, reservoir pressure, net pay thickness, water cut, and well liquid production rate. Several machine learning models (SVM, k-NN, Decision Tree, Random Forest, and linear regression) were first built on PCA-reduced features, then tuned and validated using k-fold cross-validation. The same models were then built using physics-driven features, and their performance metrics were compared. The results show a significant improvement from the proposed physics-driven feature creation compared to PCA. Over 10-fold cross-validation, PCA raised the average R² by 7 percentage points (from 70% to 77%, a relative gain of 10%), while physics-driven features raised it by 20 percentage points (from 70% to 90%). The Random Forest and linear regression models outperformed the others, particularly when built on physics-driven features. Additionally, models based on physics-driven features were less sensitive to how the data were split into training and test sets, proving more reliable and scoring better than models using the original features. |
format | Article |
id | doaj-art-33ceada318184734b0d8d418a31e48a0 |
institution | Kabale University |
issn | 2734-5319 2972-0370 |
language | English |
publishDate | 2024-12-01 |
publisher | Petroleum-Gas University of Ploiesti |
record_format | Article |
series | Romanian Journal of Petroleum & Gas Technology |
spelling | doaj-art-33ceada318184734b0d8d418a31e48a0; 2025-02-01T20:44:04Z; eng; Petroleum-Gas University of Ploiesti; Romanian Journal of Petroleum & Gas Technology; ISSN 2734-5319, 2972-0370; 2024-12-01; vol. 5, no. 2, pp. 291-306; DOI 10.51865/JPGT.2024.02.22; Eghbal Motaei (https://orcid.org/0000-0002-8191-2604; Petroleum Engineering Department, Petronas Carigali SDN BHD, Malaysia); Seyed Mehdi Tabatabai (https://orcid.org/0000-0001-5948-4343; Petroleum Engineering Department, Petronas Carigali SDN BHD, Malaysia); Tarek Ganat (https://orcid.org/0009-0000-7963-0264; Sultan Qaboos University, Oman); Ahmad Khanifar (Petroleum Engineering Department, Petronas Carigali SDN BHD, Malaysia); Sulaiman Dzaiy (Petroleum-Gas University of Ploiesti, Romania); Timur Chis (https://orcid.org/0000-0003-1751-828X; Petroleum-Gas University of Ploiesti, Romania); http://jpgt.upg-ploiesti.ro/wp-content/uploads/2024/12/22_RJPGT_no.2-2024-Physics-driven-feature-ML-models-performance-oil-production-prediction.pdf; keywords: oil rate prediction, feature engineering, principal component analysis, artificial intelligence, machine learning |
title | PHYSICS-DRIVEN FEATURE CREATION TO IMPROVE MACHINE LEARNING MODELS PERFORMANCE FOR OIL PRODUCTION RATE PREDICTION |
topic | oil rate prediction; feature engineering; principal component analysis; artificial intelligence; machine learning |
url | http://jpgt.upg-ploiesti.ro/wp-content/uploads/2024/12/22_RJPGT_no.2-2024-Physics-driven-feature-ML-models-performance-oil-production-prediction.pdf |
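
The abstract's central idea, comparing models built on original features with models built on a physics-driven feature under 10-fold cross-validation, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the data are synthetic, the physics-driven feature is the identity oil rate = liquid rate × (1 − water cut), only linear regression is shown (the PCA step and the other models are omitted), and the helper names `fit_ols`, `predict`, `r2`, and `kfold_r2` are hypothetical.

```python
import random

def fit_ols(X, y):
    """Least-squares fit with intercept, solved via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    n = len(rows[0])
    # Augmented matrix [A^T A | A^T y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(n):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][n] / A[i][i] for i in range(n)]

def predict(beta, X):
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in X]

def r2(y, yhat):
    mean = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, yhat))
    ss_tot = sum((a - mean) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def kfold_r2(X, y, k=10):
    """Mean out-of-fold R^2 over k contiguous folds (rows are already random)."""
    n = len(y)
    scores = []
    for f in range(k):
        test = set(range(f * n // k, (f + 1) * n // k))
        X_tr = [X[i] for i in range(n) if i not in test]
        y_tr = [y[i] for i in range(n) if i not in test]
        X_te = [X[i] for i in sorted(test)]
        y_te = [y[i] for i in sorted(test)]
        scores.append(r2(y_te, predict(fit_ols(X_tr, y_tr), X_te)))
    return sum(scores) / k

# Synthetic wells (assumed ranges and units, not the paper's dataset)
rng = random.Random(0)
raw, physics, oil = [], [], []
for _ in range(200):
    q_liq = rng.uniform(500, 5000)   # liquid production rate
    wc = rng.uniform(0.0, 0.9)       # water cut as a fraction
    raw.append([q_liq, wc])                      # original features
    physics.append([q_liq * (1 - wc)])           # assumed physics-driven feature
    oil.append(q_liq * (1 - wc) + rng.gauss(0, 50))  # noisy oil rate target

r2_raw = kfold_r2(raw, oil)
r2_phys = kfold_r2(physics, oil)
print(f"10-fold R^2, original features: {r2_raw:.3f}")
print(f"10-fold R^2, physics feature:   {r2_phys:.3f}")
```

In this toy setting the linear model cannot represent the product of liquid rate and water cut from the raw columns, so the single engineered feature scores higher. The size of the gap depends entirely on the synthetic setup and says nothing about the figures reported in the article.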