Finding Autoregression with energy data
We consider only the previous values of the consumed energy data. The data were obtained and cleaned previously, keeping only the Consumed_active_energy_kW column. Autoregression predicts a value from the previous values within a given lag, i.e., it predicts x(t+1) from x(t-2), x(t-1), x(t), etc.
First, we must check for autocorrelation, that is, whether x(t) depends on x(t-1), x(t-2), etc. If it does, an autoregression model is appropriate.
We can use pandas and statsmodels to help here. We compute the correlation between x(t-1) and x(t+1), and we use autocorrelation_plot (provided by pandas.plotting) to plot the autocorrelation at different lag times.
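The check described above can be sketched as follows. This is a minimal illustration, not the original code: the series is synthetic (a random walk standing in for the cleaned Consumed_active_energy_kW column), and the helper name plot_autocorrelation is hypothetical. The column labels "t-1" and "t+1" follow the result table.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the cleaned energy column (assumption, not the real data).
rng = np.random.default_rng(0)
values = 5.0 + np.cumsum(rng.normal(0.0, 0.1, size=500))
series = pd.Series(values, name="Consumed_active_energy_kW")

# Place each value next to its predecessor; shift(1) moves the series down one step.
lagged = pd.concat([series.shift(1), series], axis=1)
lagged.columns = ["t-1", "t+1"]

# Pearson correlation matrix; the row containing NaN is ignored by corr().
corr = lagged.corr()
print(corr)


def plot_autocorrelation(s):
    """Plot the autocorrelation of s over all lags (requires matplotlib)."""
    import matplotlib.pyplot as plt
    from pandas.plotting import autocorrelation_plot

    autocorrelation_plot(s)
    plt.show()
```

Calling plot_autocorrelation(series) draws the autocorrelation curve for every lag. The off-diagonal entry of corr is the lag-1 correlation.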

|     | t-1 | t+1 |
|-----|-----|-----|
| t-1 | 1.0 | 1.0 |
| t+1 | 1.0 | 1.0 |
The correlation between t-1 and t+1 is 1.0, so consecutive values are highly correlated.
We then create a simple autoregression model that uses the previous value x(t-1) to predict the next value.
We use 67% of the data as the training set and the remainder as the test set. A more meaningful split is also possible: for example, use 2 weeks of data to predict the next 5 days. The number of prediction points follows from the time interval between points. That many points are taken from the end of the series as the test set, and the remainder is the training set.
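Both ways of splitting can be sketched as below. The function names and the 15-minute sampling interval are assumptions made for illustration; the source does not state the actual interval of the data.

```python
def split_by_fraction(values, train_fraction=0.67):
    """Keep the first train_fraction of the points for training."""
    train_size = int(len(values) * train_fraction)
    return values[:train_size], values[train_size:]


def split_by_horizon(values, horizon_days, minutes_per_point):
    """Hold out the last horizon_days of points as the test set."""
    n_test = int(horizon_days * 24 * 60 / minutes_per_point)
    if n_test <= 0 or n_test >= len(values):
        raise ValueError("horizon must cover at least one point and leave training data")
    return values[:-n_test], values[-n_test:]


# Example: 19 days of readings, one every 15 minutes (assumed interval).
points_per_day = 24 * 60 // 15
readings = list(range(19 * points_per_day))

train, test = split_by_horizon(readings, horizon_days=5, minutes_per_point=15)
print(len(train), len(test))
```

With these assumed numbers, the last 5 days form the test set and the first 2 weeks remain for training.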
We use pandas shift to reorganize the data into input/output pairs, with x(t-1) as the input used to predict x(t). mean_squared_error(test_y, predictions) is used to measure the quality of the prediction, where test_y holds the true values of the test set and predictions holds the values predicted by the model.
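A small sketch of the reorganization with shift, using a handful of made-up values rather than the real energy readings. The column names "t-1" and "t" are chosen here for illustration.

```python
import pandas as pd

# Made-up values standing in for the energy series (assumption).
series = pd.Series([1.0, 1.2, 1.1, 1.4, 1.3, 1.5, 1.6, 1.4, 1.7, 1.8])

# Column "t-1" is the input x(t-1); column "t" is the target x(t).
frame = pd.concat([series.shift(1), series], axis=1)
frame.columns = ["t-1", "t"]

# The first row has no previous value, so it is dropped.
frame = frame.dropna()

X = frame["t-1"].to_numpy()
y = frame["t"].to_numpy()
print(frame.head())
```

Each row of frame now pairs an input with the value that follows it, so X and y can be split into training and test parts at the same index.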
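We walk through the test inputs one point at a time: each point is passed to the model, which returns the prediction for the next value.
    predictions = []  # collected one-step-ahead predictions
    for x in test_X:
        yhat = model_persistence(x)  # predict the next value from x(t-1)
        predictions.append(yhat)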
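Put together, the whole procedure might look like the sketch below. It assumes that model_persistence simply returns its input (the source does not show its definition) and runs on a synthetic random walk, so the printed errors will not match the figures reported for the real data. The MSE is computed with NumPy here; for one-dimensional inputs it is the same quantity that mean_squared_error(test_y, predictions) returns.

```python
import math

import numpy as np


def model_persistence(x):
    """Assumed persistence model: the forecast is the previous value itself."""
    return x


# Synthetic stand-in for the energy series (assumption).
rng = np.random.default_rng(1)
values = 5.0 + np.cumsum(rng.normal(0.0, 0.1, size=300))

# Inputs are x(t-1), targets are x(t).
X, y = values[:-1], values[1:]

# 67% of the pairs for training, the remainder for testing.
train_size = int(len(X) * 0.67)
test_X, test_y = X[train_size:], y[train_size:]

# Walk forward over the test inputs, one prediction per point.
predictions = []
for x in test_X:
    predictions.append(model_persistence(x))

# Mean squared error and its square root.
test_mse = float(np.mean((test_y - np.array(predictions)) ** 2))
test_rmse = math.sqrt(test_mse)
print(f"Test MSE: {test_mse:.3f}")
print(f"Test RMSE: {test_rmse:.3f}")
```

The persistence model has nothing to fit, so the training part is unused here. It is kept only so the split matches the one a fitted model would use.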
We can then plot the predictions against test_y.

Evaluating the predictions yields:
Test MSE: 0.160
Test RMSE: 0.400
Thus, this simple model already predicts the test set well, and its error gives a baseline against which more elaborate models can be compared.
Next, we will use ARIMA for model prediction.