Exploration Insights December 2019/ January 2020 | Page 14

uses multi-linear regression to predict classes of success (Figure 4). However, the greatest prediction accuracy achieved using the SGD classifier was 46%.

Decision Tree and Random Forest classifiers are more complex algorithms, which have been successfully used to help improve production in conventional and unconventional plays. The Decision Tree classifier is built by repeatedly splitting the data until the statistical majority of the data belonging to each of the leaf nodes, or predicted classes, has the same label (Du et al., 2002). Figure 5 shows an example of how this could be done using labels from this dataset. However, even after fine-tuning, this algorithm achieved a maximum prediction accuracy of only 55%.

Figure 5> Schematic of the anatomy of a Decision Tree classifier model with an example of input data based on the geological attributes from this study.
[Schematic: a root node (TVD < 250 m; 150 samples, 50 in each of three classes; Gini = 0.7) branches True/False into a leaf node and a split node (Average thickness < 100), which in turn branches into two further leaf nodes. Each node is annotated with its Gini value, sample count, per-class sample counts (Value), and majority class.]

Random Forest aggregates a collection of Decision Trees, classifying data by determining the class most commonly predicted by the group of trees, forming a data forest. Training each tree on a different sample limits variance by introducing randomness into the model and creating greater tree diversity (Géron, 2017) (Figure 6). This is reflected in a maximum accuracy of 97%, achieved using four classes that split the data into quartiles of normalized initial production success.
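The Gini value attached to each node of a Decision Tree, and the vote that a Random Forest takes across its trees, are both simple calculations. The following is a minimal Python sketch, not the implementation used in this study; the function names `gini_impurity` and `majority_vote` and the three example votes are illustrative assumptions. For a node holding 50 samples of each of three classes, as in the root node of the Figure 5 schematic, the impurity is about 0.667, which the schematic shows rounded to 0.7.

```python
from collections import Counter


def gini_impurity(class_counts):
    """Gini impurity of a node: 1 - sum(p_k ** 2) over class proportions p_k."""
    total = sum(class_counts)
    if total == 0:
        return 0.0  # an empty node is treated as pure (assumption)
    return 1.0 - sum((count / total) ** 2 for count in class_counts)


def majority_vote(tree_predictions):
    """Return the class predicted by the most trees (ties go to the first seen)."""
    return Counter(tree_predictions).most_common(1)[0][0]


# Balanced three-class node: maximally mixed, so impurity is high.
root = gini_impurity([50, 50, 50])
# Node containing a single class: perfectly pure, so impurity is zero.
pure = gini_impurity([50, 0, 0])
# Hypothetical votes from three trees for one well.
final_class = majority_vote(["Class-X", "Class-Y", "Class-X"])

print(round(root, 3), pure, final_class)  # 0.667 0.0 Class-X
```

A split is attractive when it produces child nodes whose impurity is lower than that of the parent, which is why a leaf containing a single class scores 0.0.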
Systematic misclassification errors consistently occurred within the mid-range classes. They were reduced by decreasing the number of classes of production success from quintiles to quartiles, which increased the number of data points per class. The geological input features used in this most accurate model include play thickness, pore pressure, true vertical depth (TVD), and resource concentration.

Figure 6> Schematic of the Random Forest classification model process (Dimitriadis & Liparas, 2018).
[Schematic: an instance is passed to each tree of the Random Forest (Tree-1, Tree-2, through Tree-n); each tree predicts a class (Class-X, Class-Y, Class-X), and majority voting determines the final class.]

FACTORS INFLUENCING PRODUCTION

Extraction of the geological input features most strongly correlated to production success in each unconventional play provided insight into the fundamental geological differences between U.S. unconventional plays. This insight is valuable when exploring, appraising, developing, and producing from these plays and analogous play types globally.

Feature Weighting - Middle Bakken Shale Landing Zone

Rank  Feature                      Weighting
1     Pore Pressure                0.578
2     Reservoir Pressure           0.234
3     Maximum Burial Temperature   0.0788
4     Porosity                     0.0373
5     Average Thickness            0.0265
6     Geothermal Gradient          0.0212
7     Resource Concentration       0.0174
8     TVD                          0.00618
9     GOR                          0.00101

Table 2> Feature weightings of the impact of each geological parameter on normalized initial production found using Feature Extraction analysis of the Middle Bakken Member landing zone.

Our analysis demonstrates that pore pressure gradient is strongly correlated with production in the majority of plays. Pore pressure is related to increased flow rate in areas of overpressure (Darcy’s Law), and to a reduction in the effective stress required to induce hydraulic fracture. However, the importance weighting between geological variables and production rate varies significantly between plays. To demonstrate this, two of the most prolific U.S.
plays, the Bakken and Marcellus plays, were analyzed to understand the geological factors that impact their success.

Bakken

The Middle Bakken is considered to be a hybrid system, or tight reservoir (Jarvie, 2012). Key areas of hydrocarbon production are associated with trapping mechanisms that enhance hydrocarbon saturation within the Middle Bakken target unit. These subtle traps can be structural and/or stratigraphic, and relate to deformation within the Williston Basin or to lithological heterogeneity within the Middle Bakken. Hydrocarbon generation in the thermally mature kitchen increases the pore pressure gradient, resulting in hydrocarbon migration and trapping in pressure cells bounded by effective lateral and top seals. The Parshall Field is an example of a production sweet-spot where significant overpressure has developed within an isolated pressure cell, driving a high production rate from the Middle Bakken target. This is reflected in the heavily skewed feature importance weightings in Table 2, which show that production is highly sensitive to pore pressure.

Marcellus

In contrast to the Bakken, the Marcellus is a continuous shale play in which hydrocarbon saturation is pervasive and relatively constant. Within the Marcellus play, unconventional reservoir quality is highly sensitive to TOC content, as this influences the development of the organic-hosted porosity that provides the hydrocarbon storage capacity within the play (Zagorski et al., 2011). There are two production sweet-spots within the Marcellus play: the south-western core and the north-eastern core.
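To make the contrast between the two plays concrete, the weightings reported in Table 2 and Table 3 can be compared by asking how much of the total weighting the highest-ranked features carry. The following Python sketch is an illustration only and is not part of the study's workflow; the helper `top_share` is a hypothetical name, and the values are transcribed from the two tables.

```python
# Feature weightings transcribed from Table 2 (Middle Bakken landing zone).
bakken = {
    "Pore Pressure": 0.578,
    "Reservoir Pressure": 0.234,
    "Maximum Burial Temperature": 0.0788,
    "Porosity": 0.0373,
    "Average Thickness": 0.0265,
    "Geothermal Gradient": 0.0212,
    "Resource Concentration": 0.0174,
    "TVD": 0.00618,
    "GOR": 0.00101,
}

# Feature weightings transcribed from Table 3 (Marcellus Shale).
marcellus = {
    "Average Thickness": 0.186,
    "Porosity": 0.184,
    "Resource Concentration": 0.153,
    "TVD": 0.127,
    "Pore Pressure": 0.111,
    "Maximum Burial Temperature": 0.102,
    "Reservoir Pressure": 0.0932,
    "GOR": 0.0429,
    "Geothermal Gradient": 0.0,
}


def top_share(weights, n=1):
    """Fraction of the total weighting carried by the n highest-ranked features."""
    ranked = sorted(weights.values(), reverse=True)
    return sum(ranked[:n]) / sum(ranked)


for play, weights in (("Middle Bakken", bakken), ("Marcellus", marcellus)):
    print(play, round(top_share(weights, 1), 2), round(top_share(weights, 2), 2))
```

In the Middle Bakken the top two features account for more than 80% of the total weighting, whereas in the Marcellus they account for less than 40%, which is consistent with the skewed and the more evenly distributed weightings shown in the two tables.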
Feature Weighting - Marcellus Shale

Rank  Feature                      Weighting
1     Average Thickness            0.186
2     Porosity                     0.184
3     Resource Concentration       0.153
4     TVD                          0.127
5     Pore Pressure                0.111
6     Maximum Burial Temperature   0.102
7     Reservoir Pressure           0.0932
8     GOR                          0.0429
9     Geothermal Gradient          0

[Legend: limiting factor of the southwestern core; limiting factor of the northeastern core.]

Table 3> Feature weightings of the impact of each geological parameter on normalized initial production found using Feature Extraction analysis of the Marcellus Shale.

Within the south-western