Artificial and Human Intelligence with Digital Twins
case is to create a set of lagged variables for
the predictors and the response variable.
The response variable is the energy
produced.
You can use RNNs for one-step-ahead
forecasting, where the model's forecast
interval matches, or is shorter than, the
desired forecast interval. This typically
yields the most accurate forecast. In some
cases, you might need a multistep forecast
that projects further time periods by building
on the near-term forecast estimates. These
forecasts are typically less accurate, but you
can test them to determine whether they
have sufficient accuracy.
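One common way to build a multistep forecast from a one-step-ahead model is to feed each prediction back in as the newest lagged input. The following is a minimal sketch of that idea, not the method used for the solar farm. The names `recursive_forecast` and `mean_of_window` are hypothetical, and `mean_of_window` is only a stand-in for a trained RNN.

```python
from typing import Callable, List, Sequence


def recursive_forecast(
    one_step_model: Callable[[Sequence[float]], float],
    history: Sequence[float],
    n_lags: int,
    n_steps: int,
) -> List[float]:
    """Roll a one-step-ahead model forward by feeding predictions back in."""
    if n_lags < 1 or len(history) < n_lags:
        raise ValueError("need n_lags >= 1 and at least n_lags observations")
    window = list(history[-n_lags:])  # most recent observations, oldest first
    forecasts: List[float] = []
    for _ in range(n_steps):
        y_hat = one_step_model(window)
        forecasts.append(y_hat)
        # Drop the oldest value and append the estimate as the newest "lag".
        window = window[1:] + [y_hat]
    return forecasts


def mean_of_window(window: Sequence[float]) -> float:
    """Hypothetical stand-in for a trained one-step-ahead RNN."""
    return sum(window) / len(window)


energy_history = [2.0, 4.0, 6.0]  # made-up hourly energy values
multistep = recursive_forecast(mean_of_window, energy_history, n_lags=2, n_steps=3)
print(multistep)
```

Because every step after the first uses estimates instead of measurements, errors can compound, which is one reason multistep forecasts tend to be less accurate.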
To train this RNN, take the historical input
database and create lagged variables for the
predictors and response variable. The
number of lags is determined by the time
interval of the measurement data and the
expected correlation of previous
measurements on the forecast time horizon.
For the solar farm example, we are
producing one-hour-ahead forecasts, and
the data over the last few hours is sufficient
to capture the primary effects for the
forecast. Note that a wide variety of
conditions and preceding weather patterns
can occur throughout the year, even though
the forecast horizon is fairly short. Since we
have a large amount of historical data
covering these conditions, an RNN is
appropriate for this problem.
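As a rough illustration of the lagged-variable step, the sketch below builds one training row per time step from the previous `n_lags` measurements of every column. The column names (`irradiance`, `energy`), the sample values, and the helper `make_lagged_rows` are assumptions made for this example, not part of the solar farm data set.

```python
from typing import Dict, List, Sequence, Tuple


def make_lagged_rows(
    series: Dict[str, Sequence[float]], response: str, n_lags: int
) -> Tuple[List[List[float]], List[float]]:
    """Build (features, targets) where each row holds lags 1..n_lags of every column."""
    n = len(series[response])
    if any(len(values) != n for values in series.values()):
        raise ValueError("all columns must have the same length")
    features: List[List[float]] = []
    targets: List[float] = []
    for t in range(n_lags, n):
        row: List[float] = []
        for name in sorted(series):  # fixed column order: alphabetical
            # Lag 1 is the most recent measurement before time t.
            row.extend(series[name][t - lag] for lag in range(1, n_lags + 1))
        features.append(row)
        targets.append(series[response][t])
    return features, targets


# Made-up hourly measurements for illustration only.
hourly = {
    "irradiance": [0, 100, 300, 500, 600],
    "energy": [0.0, 1.0, 3.0, 5.0, 6.0],
}
features, targets = make_lagged_rows(hourly, response="energy", n_lags=2)
print(features[0], targets[0])
```

The first `n_lags` time steps produce no rows because they lack a full set of previous measurements.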
Reinforcement learning
Reinforcement learning (RL) is a subfield of
machine learning and deals with sequential
decision-making in a stochastic
environment. In any RL problem, there is at
least one agent and an environment. The
agent observes the state of the environment
and selects and executes an action. The
environment returns a reward and a new
state in response to the action. With the new
state, the agent selects and executes another
action, the environment returns a reward
and a new state, and this procedure continues
iteratively. RL algorithms are designed to
train an agent through this interaction with
the environment, with the goal of maximizing
the sum of rewards.
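The interaction loop described above can be sketched in a few lines. This example shows only the loop (observe state, act, receive reward and new state) and the sum of rewards. The agent uses a random policy and does not learn. `CoinEnv`, `RandomAgent`, and `run_episode` are hypothetical names for a toy problem.

```python
import random
from typing import List, Tuple


class CoinEnv:
    """Toy stochastic environment: the next state is a fair coin flip."""

    def __init__(self, seed: int = 0) -> None:
        self._rng = random.Random(seed)
        self.state = 0

    def reset(self) -> int:
        self.state = self._rng.randint(0, 1)
        return self.state

    def step(self, action: int) -> Tuple[int, float]:
        """Return (new_state, reward); reward is 1.0 if the action guessed the new state."""
        next_state = self._rng.randint(0, 1)
        reward = 1.0 if action == next_state else 0.0
        self.state = next_state
        return next_state, reward


class RandomAgent:
    """Placeholder agent with a random policy; a real RL agent would learn from rewards."""

    def __init__(self, seed: int = 1) -> None:
        self._rng = random.Random(seed)

    def act(self, state: int) -> int:
        return self._rng.randint(0, 1)


def run_episode(env: CoinEnv, agent: RandomAgent, n_steps: int) -> List[float]:
    state = env.reset()
    rewards: List[float] = []
    for _ in range(n_steps):
        action = agent.act(state)          # agent observes the state and acts
        state, reward = env.step(action)   # environment returns new state and reward
        rewards.append(reward)
    return rewards


rewards = run_episode(CoinEnv(), RandomAgent(), n_steps=10)
total_reward = sum(rewards)  # the quantity an RL algorithm tries to maximize
print(total_reward)
```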
Since training and evaluating the RNN model
depends on the order of the observations,
partitioning the data requires more care than
typical random partitioning. In this case, we need to
preserve the sequence of the data for use in
the model creation steps (training,
validation, test). The easiest way to do this is
to partition the data based on the time
variable. Use the earliest historical data for
the training data set. Then use the next time
partition for the validation data set. Finally,
use the most recent data for the testing data
set. This is sufficient if the performance of
the asset has been consistent over the
historical data sample. If there have been
periods of degraded performance, it is best
to eliminate that data from the data sets
used to create the model.
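A time-based partition like the one described above might look like the following sketch. The 70/15/15 split, the `(hour, energy)` record layout, and the function name `chronological_split` are assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple

Record = Tuple[int, float]  # (hour index, energy produced); hypothetical layout


def chronological_split(
    records: Sequence[Record], train_pct: int = 70, val_pct: int = 15
) -> Tuple[List[Record], List[Record], List[Record]]:
    """Split by time: earliest data for training, next for validation, latest for testing."""
    ordered = sorted(records, key=lambda r: r[0])  # sort on the time variable
    n = len(ordered)
    train_end = n * train_pct // 100
    val_end = train_end + n * val_pct // 100
    return ordered[:train_end], ordered[train_end:val_end], ordered[val_end:]


# Made-up records, deliberately out of order to show that the split sorts by time.
records = [(hour, float(hour % 5)) for hour in reversed(range(20))]
train, val, test = chronological_split(records)
print(len(train), len(val), len(test))
```

No shuffling takes place, so every validation record is later than every training record, and every test record is later than every validation record.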
RL has recently received much attention due
to its successes in computer games and
November 2019