Exploration Insights December 2019/January 2020 | Page 14
14 | Halliburton Landmark
uses multi-linear regression to predict classes
of success (Figure 4). However, the greatest
prediction accuracy achieved using the SGD
classifier was 46%. Decision Tree and Random
Forest classifiers are more complex algorithms
that have been used successfully to help improve
production in conventional and unconventional
plays. A Decision Tree classifier is built by
repeatedly splitting the data until the majority
of the samples in each leaf node, or predicted
class, share the same label (Du et al., 2002).
Figure 5 shows an example of how this could be
done using labels from this dataset. Even after
fine-tuning, however, this algorithm achieved a
maximum prediction accuracy of only 55%.
1) Root Node: TVD < 250 m; Gini = 0.7; Samples = 150; Value = [50, 50, 50]; Class = 1
   True  -> 3) Leaf Node: Gini = 0.0; Samples = 150; Value = [50, 0, 0]; Class = 1
   False -> 2) Split Node: Average thickness < 100; Gini = 0.5; Samples = 100; Value = [10, 50, 50]; Class = 2
            3) Leaf Node: Gini = 0.15; Samples = 54; Value = [0, 49, 5]; Class = 2
            3) Leaf Node: Gini = 0.05; Samples = 46; Value = [0, 1, 45]; Class = 3
Figure 5> Schematic of the anatomy of a Decision Tree
classifier model with an example of input data based on the
geological attributes from this study.
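The "Gini" value reported at each node of the schematic can be illustrated with a short sketch. It assumes the tree scores its splits with the Gini impurity, defined as 1 minus the sum of the squared class proportions in a node. The function names and the sample counts are illustrative (the counts are modeled on the root node of Figure 5) and are not taken from the study's code.

```python
def gini_impurity(class_counts):
    """Gini impurity, 1 - sum(p_k^2), for a node with the given per-class sample counts."""
    total = sum(class_counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in class_counts)


def weighted_split_impurity(children):
    """Sample-weighted mean Gini impurity of the child nodes produced by one split."""
    total = sum(sum(counts) for counts in children)
    return sum(sum(counts) / total * gini_impurity(counts) for counts in children)


# Illustrative three-class parent node with 50 samples per class.
parent = [50, 50, 50]
# Illustrative split that isolates class 1 and leaves classes 2 and 3 mixed.
children = [[50, 0, 0], [0, 50, 50]]

print(round(gini_impurity(parent), 3))              # 0.667
print(round(weighted_split_impurity(children), 3))  # 0.333
```

A split is useful when the weighted impurity of the children is lower than the impurity of the parent, as it is here; a node containing a single class has an impurity of zero.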
Random Forest aggregates a collection of
Decision Trees and classifies each data point by
the class most commonly predicted by the group
of trees, or forest. Training each tree on a
different sample of the data introduces
randomness into the model, creates greater tree
diversity, and limits variance (Géron, 2017;
Figure 6). This is reflected in a maximum accuracy
of 97%, achieved using four classes that split the
data into quartiles of normalized initial production
success. Systematic misclassification errors
occurred consistently within the mid-range
classes; they were reduced by decreasing the
number of production-success classes from
quintiles to quartiles, which increases the number
of data points per class. The geological input
features used in this most accurate model
include play thickness, pore pressure, true
vertical depth (TVD), and resource concentration.
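Two steps in the paragraph above, binning production into quartile classes and combining the trees' predictions, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the workflow used in the study: the function names, the rank-based binning rule, and all input values are hypothetical, and the training of the individual trees is omitted.

```python
from collections import Counter


def quantile_classes(values, n_classes=4):
    """Assign each value to one of n_classes equal-count bins (1 = lowest) by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, index in enumerate(order):
        labels[index] = rank * n_classes // len(values) + 1
    return labels


def majority_vote(tree_predictions):
    """Return the class predicted by the largest number of trees.

    Ties are resolved in favor of the class that appears first in the list.
    """
    return Counter(tree_predictions).most_common(1)[0][0]


# Hypothetical normalized initial production values for eight wells.
normalized_ip = [0.9, 0.1, 0.5, 0.3, 0.7, 0.2, 0.8, 0.4]
print(quantile_classes(normalized_ip))  # [4, 1, 3, 2, 3, 1, 4, 2]

# Hypothetical class predictions from five trees for a single well.
print(majority_vote([3, 4, 3, 3, 2]))   # 3
```

With a fixed number of wells, calling `quantile_classes(values, n_classes=5)` would produce quintiles with fewer wells per class than quartiles, which is the trade-off described above.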
[Figure 6: Tree-1, Tree-2, ... Tree-n -> Class-X, Class-Y, ... Class-X -> Majority Voting -> Final Class]
Figure 6> Schematic of the Random Forest classification
model process (Dimitriadis & Liparas, 2018).

FACTORS INFLUENCING PRODUCTION
Extraction of the geological input features most
strongly correlated with production success in
each unconventional play provided insight into
the fundamental geological differences between
U.S. unconventional plays. This insight is valuable
when exploring, appraising, developing, and
producing from these plays and from analogous
play types globally.
Feature Weighting - Middle Bakken Shale Landing Zone
Rank  Feature                      Weighting
1     Pore Pressure                0.578
2     Reservoir Pressure           0.234
3     Maximum Burial Temperature   0.0788
4     Porosity                     0.0373
5     Average Thickness            0.0265
6     Geothermal Gradient          0.0212
7     Resource Concentration       0.0174
8     TVD                          0.00618
9     GOR                          0.00101
Table 2> Feature weightings of the impact of each geological
parameter on normalized initial production found using Feature
Extraction analysis of the Middle Bakken Member landing zone.

Our analysis demonstrates that pore pressure
gradient is strongly correlated with production
in the majority of plays. Overpressure is
associated with increased flow rate (Darcy’s Law)
and with a reduction in the effective
stress required to induce hydraulic fracture.
However, the relative importance of each
geological variable to production rate varies
significantly between plays. To demonstrate this,
two of the most prolific U.S. plays, the Bakken
and the Marcellus, were analyzed to
understand the geological factors that affect
their success.
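The two pore-pressure effects mentioned above can be made concrete with a small worked example. It is a sketch under simplifying assumptions that are not stated in the article: steady, single-phase linear Darcy flow, and an effective stress equal to total stress minus pore pressure (a Biot coefficient of 1). The function names and all input values are hypothetical.

```python
def darcy_flow_rate(permeability_m2, area_m2, viscosity_pa_s, pressure_drop_pa, length_m):
    """Volumetric flow rate (m^3/s) for steady, single-phase linear Darcy flow."""
    return permeability_m2 * area_m2 * pressure_drop_pa / (viscosity_pa_s * length_m)


def effective_stress(total_stress_pa, pore_pressure_pa, biot_coefficient=1.0):
    """Effective stress (Pa): total stress minus the part carried by the pore fluid."""
    return total_stress_pa - biot_coefficient * pore_pressure_pa


# Hypothetical rock and fluid properties; only the pressure drop changes.
rock = dict(permeability_m2=1e-18, area_m2=100.0, viscosity_pa_s=1e-3, length_m=50.0)
q_normal = darcy_flow_rate(pressure_drop_pa=10e6, **rock)
q_overpressured = darcy_flow_rate(pressure_drop_pa=20e6, **rock)
print(q_overpressured / q_normal)  # flow rate scales linearly with the pressure drop

# Hypothetical total stress of 60 MPa with normal and elevated pore pressure.
print(effective_stress(60e6, 30e6))  # 30000000.0
print(effective_stress(60e6, 40e6))  # 20000000.0
```

Under these assumptions, a larger pressure drop gives a proportionally larger flow rate, and a higher pore pressure leaves less effective stress to overcome.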
Bakken
The Middle Bakken is considered to be a hybrid
system, or tight reservoir (Jarvie, 2012). Key areas
of hydrocarbon production are associated with
trapping mechanisms that enhance hydrocarbon
saturation within the Middle Bakken target
unit. These subtle traps can be structural and/or
stratigraphic, and relate to deformation within
the Williston Basin, or lithological heterogeneity
within the Middle Bakken.
Hydrocarbon generation in the thermally mature
kitchen increases the pore pressure gradient,
resulting in hydrocarbon migration and trapping
in pressure cells bounded by effective lateral
and top seals. The Parshall Field is an example
of a production sweet-spot where significant
overpressure has developed within an isolated
pressure cell, driving a high production rate from
the Middle Bakken target. This is reflected in the
heavily skewed feature importance weightings
in Table 2, showing that production is highly
sensitive to pore pressure.
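One way to quantify how skewed the weightings are is to count how many top-ranked features are needed to reach a given share of the total weight. The sketch below applies this to the values in Tables 2 and 3; the 80% threshold and the function name are arbitrary choices made for illustration and are not part of the study.

```python
def features_to_reach(weights, threshold=0.8):
    """Count the top-ranked features needed for the cumulative weight to reach threshold."""
    cumulative = 0.0
    for count, weight in enumerate(sorted(weights.values(), reverse=True), start=1):
        cumulative += weight
        if cumulative >= threshold:
            return count
    return len(weights)


# Feature weightings from Table 2 (Middle Bakken landing zone).
bakken = {
    "Pore Pressure": 0.578, "Reservoir Pressure": 0.234,
    "Maximum Burial Temperature": 0.0788, "Porosity": 0.0373,
    "Average Thickness": 0.0265, "Geothermal Gradient": 0.0212,
    "Resource Concentration": 0.0174, "TVD": 0.00618, "GOR": 0.00101,
}
# Feature weightings from Table 3 (Marcellus Shale).
marcellus = {
    "Average Thickness": 0.186, "Porosity": 0.184,
    "Resource Concentration": 0.153, "TVD": 0.127, "Pore Pressure": 0.111,
    "Maximum Burial Temperature": 0.102, "Reservoir Pressure": 0.0932,
    "GOR": 0.0429, "Geothermal Gradient": 0.0,
}

for name, weights in (("Bakken", bakken), ("Marcellus", marcellus)):
    top_feature = max(weights, key=weights.get)
    print(name, top_feature, weights[top_feature], features_to_reach(weights))
# Bakken Pore Pressure 0.578 2
# Marcellus Average Thickness 0.186 6
```

Two features account for more than 80% of the Bakken weighting, whereas six are needed in the Marcellus, where the weight is spread more evenly.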
Marcellus
In contrast to the Bakken, the Marcellus is a
continuous shale play in which hydrocarbon
saturation is pervasive and relatively constant.
Within the Marcellus play, unconventional
reservoir quality is highly sensitive to TOC
content, as this influences the development of
the organic-hosted porosity that provides the
hydrocarbon storage capacity within the play
(Zagorski et al., 2011).
Feature Weighting - Marcellus Shale
Rank  Feature                      Weighting
1     Average Thickness            0.186
2     Porosity                     0.184
3     Resource Concentration       0.153
4     TVD                          0.127
5     Pore Pressure                0.111
6     Maximum Burial Temperature   0.102
7     Reservoir Pressure           0.0932
8     GOR                          0.0429
9     Geothermal Gradient          0
Table annotations: Limiting factor of the southwestern core;
Limiting factor of the northeastern core
Table 3> Feature weightings of the impact of each geological
parameter on normalized initial production found using Feature
Extraction analysis of the Marcellus Shale.

There are two production sweet-spots within the
Marcellus play: the south-western core and the
north-eastern core. Within the south-western