Master's student, Khazar University, Baku, Azerbaijan
MACHINE LEARNING AND DEEP LEARNING-ENHANCED PRODUCTION DECLINE CURVE ANALYSIS FOR IMPROVED OIL RECOVERY FORECASTING
ABSTRACT
The accurate forecasting of oil production is critical for efficient reservoir management, strategic planning, and maximizing hydrocarbon recovery. Traditional Decline Curve Analysis (DCA) techniques, such as exponential, hyperbolic, and harmonic models, have long served as standard tools in petroleum engineering. However, their effectiveness diminishes in unconventional reservoirs or operational environments characterized by nonlinear dynamics, irregular production trends, and enhanced oil recovery (EOR) interventions.
This study introduces a data-driven framework that integrates Machine Learning (ML) and Deep Learning (DL) techniques into the DCA process to improve production forecasting accuracy. Using the publicly available Volve Field dataset, predictive models including ARIMA, Random Forest (RF), and Long Short-Term Memory (LSTM) networks were developed and evaluated. Feature engineering and time-series data preprocessing were employed to enhance model performance and adaptability to changing reservoir behavior.
Keywords: Decline Curve Analysis, Machine Learning, Deep Learning, LSTM, Oil Production Forecasting.
Introduction
In the context of modern petroleum engineering, the convergence of traditional reservoir analysis techniques with artificial intelligence (AI) methodologies—specifically machine learning (ML) and deep learning (DL)—represents a significant shift in the way oil recovery forecasting is approached. One of the most enduring and widely used techniques in oil production forecasting is Decline Curve Analysis (DCA). Rooted in empirical modeling, DCA involves fitting mathematical functions such as exponential, hyperbolic, or harmonic decline models to historical production data in order to predict future performance. These models have been highly effective in conventional reservoirs, where reservoir behavior is relatively stable and well understood (Arps, 1945; Al-Kaabi & Khan, 2020).
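For reference, the three classical decline models named above can be written as one function of the Arps decline exponent b. The sketch below is illustrative only: the function name, argument names, and the example values are assumptions and are not taken from the study.

```python
import math

def arps_rate(t, qi, di, b):
    """Arps decline rate q(t).

    t  : time since the start of decline (same time unit as 1/di)
    qi : initial rate at t = 0
    di : initial nominal decline rate (1/time)
    b  : decline exponent; 0 gives exponential, 1 gives harmonic,
         values between 0 and 1 give hyperbolic decline
    """
    if b == 0:
        # Exponential decline: q = qi * exp(-di * t)
        return qi * math.exp(-di * t)
    # Hyperbolic decline (harmonic when b == 1): q = qi / (1 + b*di*t)^(1/b)
    return qi / (1.0 + b * di * t) ** (1.0 / b)

# Illustrative values only (not from the Volve dataset).
for label, b in [("exponential", 0.0), ("hyperbolic", 0.5), ("harmonic", 1.0)]:
    print(label, round(arps_rate(12.0, 1000.0, 0.1, b), 1))
```

For the same initial rate and decline, the exponential curve falls fastest and the harmonic curve slowest, which is why the choice of b has a large effect on long-term forecasts.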
However, the increasing complexity of modern oil and gas fields, particularly unconventional plays such as shale formations, tight gas reservoirs, and reservoirs subjected to Enhanced Oil Recovery (EOR) techniques, has exposed the limitations of traditional DCA models. These models typically assume a stationary and smooth production decline that fails to account for nonlinear dynamics, operational fluctuations, and multi-well interference effects. Moreover, they often require a significant degree of manual calibration and engineering judgment, which introduces subjectivity and limits scalability (Zhou & Li, 2023).
Recent advancements in data science have introduced ML and DL algorithms as powerful tools for forecasting complex production behaviors. Unlike classical DCA, these techniques are data-driven and capable of learning patterns directly from large datasets without requiring predefined functional forms. ML algorithms such as Random Forests, Support Vector Machines (SVMs), and Gradient Boosted Trees have been used to capture nonlinear interactions between operational parameters and production outcomes. Meanwhile, DL models such as Recurrent Neural Networks (RNNs), Long Short-Term Memory networks (LSTMs), and Transformer-based architectures are designed specifically to model temporal sequences and long-range dependencies, making them particularly suited for time-series production data (Lim et al., 2021; Roustazadeh et al., 2022).
The integration of these AI techniques into the DCA workflow enhances both forecast accuracy and decision-making capabilities. These models can adapt to changing reservoir conditions, automatically incorporate operational variables (such as pressure, temperature, choke size), and even detect early signs of production anomalies. Their ability to generalize across multiple wells and update predictions in real-time provides substantial benefits in dynamic and data-intensive environments.
In this context, the aim of the present research is to develop a comprehensive DCA framework enhanced by machine learning and deep learning methodologies. By comparing traditional empirical models with AI-based approaches using the Volve Field dataset from Equinor, this study evaluates the practical efficacy of different forecasting techniques in real-world scenarios. The findings are intended to inform the development of more accurate, scalable, and adaptive production forecasting systems that meet the evolving demands of the petroleum industry.
Methodology
This study uses the publicly available Volve Field dataset, focusing on Well 15/9-F-14. Preprocessing involved data cleaning, normalization, and feature engineering. Models developed include:
- ARIMA for statistical time-series baseline,
- Random Forest (RF) for handling nonlinear multivariate relationships,
- LSTM for modeling temporal dependencies.
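The study does not list its exact preprocessing steps, so the following is only a minimal sketch of what cleaning, normalization, and lag-based feature engineering might look like for a single rate series. The helper names, the window length, and the sample rates are hypothetical.

```python
def min_max_scale(values):
    """Scale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A constant series carries no scale information.
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]

def make_lagged_samples(series, n_lags):
    """Pair each value with the n_lags values that precede it."""
    samples = []
    for i in range(n_lags, len(series)):
        samples.append((series[i - n_lags:i], series[i]))
    return samples

# Made-up daily oil rates; None marks a missing record.
raw_rates = [950.0, 940.0, None, 910.0, 905.0, 890.0, 870.0]
rates = [r for r in raw_rates if r is not None]   # cleaning: drop missing records
scaled = min_max_scale(rates)                     # normalization
samples = make_lagged_samples(scaled, n_lags=3)   # lag features for RF or LSTM
print(len(samples), "training samples")
```

In practice the scaling bounds would be computed on the training portion only and then reused for the test portion, so that no information from the forecast period leaks into training.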
Model performance was assessed using MAE and RMSE metrics.
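Both metrics follow directly from their definitions. The short example below uses made-up actual and predicted rates, not values from the study, to show how they are computed.

```python
import math

def mae(actual, predicted):
    """Mean Absolute Error: average magnitude of the forecast errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Square Error: penalizes large errors more than MAE does."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

# Made-up rates for illustration.
actual = [100.0, 90.0, 80.0, 70.0]
predicted = [98.0, 93.0, 80.0, 64.0]
print("MAE :", mae(actual, predicted))
print("RMSE:", rmse(actual, predicted))
```

Because RMSE squares the errors before averaging, a single badly missed point (for example, a shut-in that the model did not anticipate) raises RMSE more than it raises MAE.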
Main part
Machine learning (ML) and deep learning (DL) models offer flexibility and adaptability, capturing complex patterns that traditional models may miss (Raissi et al., 2019). To check that these models generalize rather than overfit, they should be evaluated on held-out data using error metrics such as Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and R², together with cross-validation.
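For time-series data, ordinary shuffled cross-validation would let a model train on observations that come after its test period. One common alternative is a walk-forward (expanding window) split, sketched below together with R². This is an assumed validation scheme for illustration; the study does not state which scheme it used, and the function names are hypothetical.

```python
def walk_forward_splits(n_samples, n_folds, min_train):
    """Yield (train_indices, test_indices); each test block follows its training block."""
    fold_size = (n_samples - min_train) // n_folds
    if fold_size < 1:
        raise ValueError("not enough samples for the requested folds")
    for k in range(n_folds):
        train_end = min_train + k * fold_size
        test_end = train_end + fold_size
        yield list(range(train_end)), list(range(train_end, test_end))

def r_squared(actual, predicted):
    """Coefficient of determination; assumes `actual` is not constant."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

for train_idx, test_idx in walk_forward_splits(n_samples=10, n_folds=3, min_train=4):
    print("train up to", train_idx[-1], "-> test", test_idx)
```

Each fold trains on everything observed so far and tests on the next block, which mirrors how a production forecast is used in practice.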
Traditional models are favored for regulatory reporting and early-stage planning due to their clear parameters and transparency (Mohaghegh, 2017). Meanwhile, ML/DL models can adjust to external variables and evolving conditions, but their complexity requires thorough validation to gain user trust.
A key difference lies in model stability. Traditional models may offer consistent long-term forecasts in stable fields but fail during production disruptions. ML models, if properly validated, can be retrained as new data arrive and so respond dynamically (Lim et al., 2021).
Finally, model selection should consider not just accuracy but also computational efficiency, deployment ease, and integration with operational workflows. An ML model may outperform traditional methods in error metrics but may also demand more resources and technical expertise.
Case studies offer crucial evidence for assessing the real-world effectiveness of machine learning (ML) and deep learning (DL) in the oil and gas industry. As operations grow more data-driven, companies increasingly adopt these models to enhance reservoir analysis, optimize production, and reduce risk (Zhou & Li, 2023).
Globally, from conventional wells to shale plays, ML-based tools have improved decline curve analysis, automated well monitoring, and supported field development. These models often surpass traditional approaches in accuracy and responsiveness, especially when handling large, irregular datasets.
Case studies demonstrate how model selection depends on factors like data quality, reservoir type, and strategic goals. They also show how ML integrates with digital oilfield systems—e.g., production dashboards and asset management platforms—highlighting scalability, return on investment, and strategic value (Tadjer et al., 2021).
In North America, deep learning models such as LSTMs and Transformers have successfully predicted production in complex shale environments. In the Middle East and Latin America, hybrid models combining physics-based methods with ML have improved waterflooding and reservoir simulations (Hosseini & Akilan, 2023). These approaches enhance forecast accuracy and speed, enabling agile decision-making.
Offshore, real-time sensor data and ML are used for intelligent well monitoring, anomaly detection, and predictive maintenance, reducing downtime and improving safety. These applications stress the importance of continuous data integration, model retraining, and collaboration between engineers and data scientists.
Overall, real-world implementations confirm ML’s ability to transform oilfield practices—boosting efficiency, reducing costs, and supporting smarter reservoir management (Raissi et al., 2019). They also offer valuable insights into overcoming challenges and shaping the future of intelligent petroleum operations.
Figure 1. Forecast accuracy comparison in real-world oil field case studies
Figure 1 compares forecast accuracy across four oil field types, Shale (Field A), Offshore (Field B), Waterflood (Field C), and Onshore (Field D), for traditional and machine learning (ML)-based methods. In every case, the ML models outperform the traditional decline curve approaches: accuracy rises from 72% to 88% in Field A, from 68% to 85% in Field B, from 74% to 89% in Field C, and from 70% to 86% in Field D.
These results indicate that ML models are better able to capture complex, nonlinear production behavior across diverse reservoir types. Traditional models rely on predefined decline trends and often underperform under irregular production or operational shifts, whereas ML models learn directly from historical data and adapt to dynamic conditions (Mohaghegh, 2017).
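The size of the improvement can be read off the figures quoted above. The short calculation below uses only those accuracy values; the dictionary layout and variable names are illustrative.

```python
# (traditional accuracy %, ML accuracy %) as quoted for Figure 1.
accuracy = {
    "Field A (Shale)": (72, 88),
    "Field B (Offshore)": (68, 85),
    "Field C (Waterflood)": (74, 89),
    "Field D (Onshore)": (70, 86),
}

# Gain in percentage points for each field.
gains = {field: ml - trad for field, (trad, ml) in accuracy.items()}
mean_gain = sum(gains.values()) / len(gains)

for field, gain in gains.items():
    print(f"{field}: +{gain} percentage points")
print(f"Mean gain: {mean_gain:.1f} percentage points")
```

The gains fall in a narrow band of 15 to 17 percentage points, averaging 16, so the reported advantage is similar in size across all four field types.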
The consistent accuracy advantage also highlights the scalability of ML forecasting. Once trained, ML models can be applied to multiple wells or fields with minimal adjustments—supporting large-scale deployment in digital oilfields. Improved forecast precision enables better operational planning, reduced uncertainty in production estimates, and enhanced financial projections.
Overall, the chart illustrates that machine learning is not merely a theoretical innovation but a practical solution that enhances decision-making, minimizes risk, and supports more efficient reservoir management strategies.
Conclusion
This study compares traditional hyperbolic decline curve analysis (DCA) with three data-driven models: ARIMA, Random Forest, and LSTM neural networks.
- ARIMA models short-term trends but lacks accuracy for nonlinear, long-term behavior (RMSE: 30.97).
- Random Forest offers flexibility with multivariate inputs but performs poorly on temporal patterns (RMSE: 32.64).
- LSTM significantly outperforms the others (RMSE: 4.21), effectively capturing complex decline trends, shut-ins, and operational changes.
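To put the three RMSE values on a common footing, the relative error reduction of the best model can be computed directly from the reported numbers. The sketch below uses only the RMSE values listed above (their units are not stated in the study); the variable names are illustrative.

```python
# RMSE values as reported in the list above.
rmse_by_model = {"ARIMA": 30.97, "Random Forest": 32.64, "LSTM": 4.21}

# Model with the lowest RMSE.
best = min(rmse_by_model, key=rmse_by_model.get)

# Relative RMSE reduction (%) of the best model against each other model.
reduction = {
    name: 100.0 * (1.0 - rmse_by_model[best] / value)
    for name, value in rmse_by_model.items()
    if name != best
}

for name, pct in reduction.items():
    print(f"{best} vs {name}: {pct:.1f}% lower RMSE")
```

On these figures the LSTM error is roughly 86 to 87 percent lower than that of either ARIMA or Random Forest.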
Findings show that deep learning models, especially LSTM, provide higher accuracy and adaptability than traditional methods. They support real-time monitoring, handle irregular production patterns, and scale well across multiple fields.
Key implementation factors include data preprocessing, hyperparameter tuning, and robust model validation. Future work should focus on hybrid physics–AI models, explainable AI, and deployment platforms tailored to reservoir conditions.
In conclusion, machine learning enhances rather than replaces traditional DCA, offering a more precise and adaptive forecasting framework for modern petroleum operations.
References
- Arps, J. J. (1945). Analysis of decline curves. Transactions of the AIME, 160(01), 228–247. https://doi.org/10.2118/945228-G
- Hosseini, S., & Akilan, T. (2023). Advanced deep regression models for forecasting time series oil production. arXiv preprint arXiv:2308.16105.
- Lim, B., Arik, S. Ö., Loeff, N., & Pfister, T. (2021). Time-series forecasting with deep learning: A survey. arXiv preprint arXiv:2104.13478.
- Mohaghegh, S. D. (2017). Data-Driven Reservoir Modeling. Society of Petroleum Engineers.
- Raissi, M., Yazdani, A., & Karniadakis, G. E. (2019). Hidden fluid mechanics: Learning velocity and pressure fields from flow visualizations. Science, 367(6481), 1026–1030.
- Tadjer, A., Hong, A., & Bratvold, R. B. (2021). Machine learning based decline curve analysis for short-term oil production forecast. Energy Exploration & Exploitation, 39(5), 1741–1760.
- Zhou, Y., & Li, Y. (2023). Machine learning-based decline curve analysis for short-term oil production forecast. Journal of Petroleum Science and Engineering, 208, 109451.