An ARIMA Model for Electricity Consumption Forecast

Alfred Hofmann1, Brigitte Apfel1, Ursula Barth1, Christine Günther1, Ingrid Haas1, Frank Holzwarth1, Anna Kramer1, Leonie Kunz1, Nicole Sator1, Erika Siebert-Cole1 and Peter Straßer1

1 Springer-Verlag, Computer Science Editorial, Tiergartenstr. 17, 69121 Heidelberg, Germany
{Alfred.Hofmann, Brigitte.Apfel, Ursula.Barth, Christine.Guenther, Ingrid.Haas, Frank.Holzwarth, Anna.Kramer, Leonie.Kunz, Nicole.Sator, Erika.Siebert-Cole, Peter.Strasser, LNCS}@Springer.com

Abstract. The demand for electricity has been continuously increasing over the years. To understand future consumption, a good predictive model is required.

ARIMA models have been used extensively for time series prediction, with encouraging results. In this paper, an attempt is made to forecast the electricity consumption of IIT(ISM) Dhanbad using the ARIMA model. Using the mean absolute percentage error (MAPE) to measure forecast accuracy, the model was able to forecast with an error of 6.63%. The results obtained reveal that the ARIMA model has strong potential for prediction and can compete favourably with existing techniques for electricity consumption forecasting.

Keywords: electricity consumption, forecast, ARIMA.

1 Introduction

Electricity has become a necessity of everyday life, powering technology such as our cell phones, computers, lights and air conditioners, and the demand for it has been continuously increasing in every sector [1].

The increased dependency on electronic and electrical appliances necessitates forecasting future demand. Electricity consumption forecasting plays an integral role in planning the size, location and type of future generating plants, as well as in deciding and planning the maintenance of existing power systems. It also minimises the risk to utility companies and helps them determine the required resources. The demand pattern clearly shows an increasing trend with strong annual changes, and it becomes more complex due to the deregulation of energy markets.

The ARIMA model has been used extensively for forecasting in areas such as economics, stock prices, marketing, social problems and industrial production. It is a statistical analysis model known to be efficient and robust for short-term forecasting, and it requires at least 40 past data points.

In this paper, the electricity consumption of IIT(ISM) Dhanbad for the year 2008-09 is forecast from data for the years 2004 to 2008 using ARIMA models, and the root mean square error (RMSE) and mean absolute percentage error (MAPE) are used as measures of model performance to select the best model.

The rest of the paper is organised as follows. Section 2 reviews the approaches used in earlier research on forecasting electricity consumption. Section 3 presents a brief overview of the ARIMA model and lays out the dataset used. Section 4 describes the methodology employed, while Section 5 presents and analyses the experimental results obtained. Finally, Section 6 concludes the paper and outlines potential future research.

2 Related Work

Researchers have deployed different methods for forecasting electricity consumption. Azadeh and Faiz [2] presented an integrated framework based on an Artificial Neural Network, Multi-Layer Perceptron, conventional regression and design of experiments for forecasting household electricity consumption using five input variables, viz. electricity price, urban household income, urban household size, refrigerator price index and TV price index. Different variations are observed in the load profile of consumers depending on income level, residence type and locality, as well as environmental factors [3,4]. Rathod and Garg [5] adopted data mining techniques to analyse electricity consumption, extracting information using the K-means clustering algorithm. Chen et al. [6] and Benítez et al. [7] analysed seasonal electricity consumption and attempted to identify environmental effects on consumption.

3 ARIMA Model – An Overview

The ARIMA model, also known as the Box-Jenkins model, has been widely used for short-term forecasting.

The Autoregressive (AR) part of ARIMA indicates the regression of the time series on its own lagged values, Integrated (I) indicates that the values have undergone differencing, and Moving Average (MA) indicates a weighted moving average over the regression errors.

A non-seasonal ARIMA model is represented as ARIMA(p, d, q), where
p = order (number of time lags) of the AR model
d = degree of integration (differencing)
q = order of the MA model

A seasonal ARIMA model is represented as ARIMA(p, d, q)(P, D, Q)m, where P, D and Q denote the autoregressive, differencing and moving average terms for the seasonal part of the model and m is the number of periods in each season.

Estimating the values of the various terms of the ARIMA model involves finding the autocorrelation and partial autocorrelation between the values of the data. Autocorrelation is the correlation of a time series with a delayed copy of itself and is defined as ACF(k) = corr(X_t, X_{t+k}), where X_t and X_{t+k} are the current observation and the observation k periods later, respectively. Partial autocorrelation (PACF) is the partial correlation of X_{t+k} with X_t, i.e. it controls for the values of the time series at all shorter lags, which the ACF does not. It is defined for positive lags only, with values lying between -1 and +1. Table 1 shows how to make initial estimates of the ARIMA(p, d, q) parameters.

Table 1. Properties of ACF and PACF for AR, MA and ARMA.

Properties   AR(p)               MA(q)               ARMA(p, q)
ACF          Decays              Cuts after q lags   Decays
PACF         Cuts after p lags   Decays              Decays

3 Dataset

The monthly electricity consumption data of IIT(ISM) Dhanbad from July 2004 to June 2009 is taken into consideration. The data specify the units consumed (in kWh) for every month in this period.

4 Methodology Used

The steps used to build an ARIMA model are as follows:

Step 1. Data Visualisation
The data is visualised to determine whether it shows any overall trend or seasonal pattern. The time series was decomposed into its constituents, viz. trend, seasonality and residual values. The trend represents the optional, and often linear, increasing or decreasing behaviour of the series over time, whereas the seasonality depicts its optional repeating patterns or cycles of behaviour over time. The residual values are what remains after the trend and seasonality are removed from the data, making them independent of time.

The seasonal_decompose function in statsmodels was used for this decomposition.

Step 2. Stationarity Testing
A time series is stationary if its statistical properties, such as mean and variance, are constant over time. A time series that exhibits a particular behaviour over time has a very high probability of following the same behaviour in the future. Also, observations in a time series depend on time, whereas linear regression assumes that all observations are independent of each other. Making the data stationary therefore allows regression techniques to be applied to time-dependent variables.

The two major reasons for a time series to be non-stationary are trend and seasonality. The series is made stationary by estimating the trend and seasonality and eliminating them from the series. For this purpose, logarithmic transformation and differencing are applied.

Step 3. Deduction of Optimal Parameters
The ACF and PACF are used to determine suitable model parameters.

Step 4. Model Validation
This step involves validating the model using statistics and confidence intervals and tracking model performance.

Step 5. Forecast
The best model obtained is fitted to the series and used to forecast future values. The values are then reverted to the original scale.

5 Results

The electricity consumption of IIT(ISM) during the period July 2004 to June 2009 is depicted in Fig. 1, and Fig. 2 shows the constituents of the time series, viz. trend, seasonality and residual values. It can be observed that the electricity consumption data has both an overall upward trend and seasonality.

Since the time series has seasonality, seasonal ARIMA was used for forecasting.

Fig. 1. Electricity consumption data for the period 2004-09.

Fig. 2. Constituents of the electricity data.

The presence of trend and seasonality makes the data non-stationary, which is confirmed by the rolling statistics and the Dickey-Fuller test on the electricity consumption data, as illustrated in Fig. 3 and Table 2 respectively. Although only a slight change in the standard deviation is seen, it can be clearly observed that the mean varies with time. The test statistic confirms this: it is greater than the critical values, so the null hypothesis of non-stationarity cannot be rejected.

Fig. 3. Rolling statistics for the electricity data.

Table 2. Results of the Dickey-Fuller test on the electricity data.

Statistics               Value
Test Statistic           -2.144462
p-value                  0.227016
#Lags Used               7
Number of Observations   52
Critical Value (1%)      -3.562879
Critical Value (5%)      -2.918973
Critical Value (10%)     -2.597393

The series was made stationary using logarithmic transformation and differencing. It became stationary after the seasonal first difference was taken. The ACF and PACF correlograms were plotted, as depicted in Fig. 4, to select suitable AR, MA, SAR and SMA terms for the model.

Fig. 4. The ACF and PACF graphs for the first seasonal difference.

From the above correlograms, it was observed that both the ACF and PACF cut the upper confidence level for the first time at lag 0; hence the orders of both the AR and MA terms would be zero, i.e. p = 0 and q = 0. Since the ACF and PACF plots are negative at lag 12, SMA and SAR terms should be added to the model. A function was created to fit models for all possible combinations of parameters and predict the outcome with each model, and the model with the smallest MAPE was selected. The best model was found to be the seasonal ARIMA(0,1,0)x(2,0,1,12) model, which was used to forecast future electricity consumption.

Fig. 5 shows the forecasted electricity consumption (yellow) for the academic year 2008-09 and the actual data (blue); the values are also tabulated in Table 3. The best model was able to forecast the consumption with a MAPE of 6.63%.
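The reported error can be checked directly from the values in Table 3. With A_t the actual and F_t the forecasted consumption in month t, and n = 12 months, MAPE = (100/n) Σ |A_t − F_t| / A_t. The short sketch below applies this standard definition, together with the RMSE, to the tabulated values; the function names are illustrative.

```python
import math

# Values copied from Table 3 (kWh), July 2008 to June 2009.
actual = [321408, 335580, 345156, 300744, 306600, 263340,
          280572, 292272, 347604, 365376, 316116, 258036]
forecast = [304447.956, 314965.659, 345496.454, 307510.677, 305122.727, 297856.866,
            314373.984, 335838.317, 334558.729, 339294.089, 295065.403, 278028.171]


def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(a - f) / a for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean square error, in the units of the data (kWh)."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


mape_value = mape(actual, forecast)
rmse_value = rmse(actual, forecast)
print(f"MAPE = {mape_value:.2f}%")  # about 6.63, in line with the reported figure
print(f"RMSE = {rmse_value:.1f} kWh")
```

The largest percentage errors fall in December 2008 to February 2009, where the forecast overestimates consumption by roughly 12 to 15 percent.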

Fig. 5. Actual data vs. forecasted data.

Table 3. Actual electricity consumption vs. the forecasted data (in kWh).

Month            Actual    Forecasted
July 2008        321408    304447.956
August 2008      335580    314965.659
September 2008   345156    345496.454
October 2008     300744    307510.677
November 2008    306600    305122.727
December 2008    263340    297856.866
January 2009     280572    314373.984
February 2009    292272    335838.317
March 2009       347604    334558.729
April 2009       365376    339294.089
May 2009         316116    295065.403
June 2009        258036    278028.171

6 Conclusions

Electricity demand forecasting plays an integral role in planning electricity production and determining the resources, such as fuels, needed to operate the plants. Furthermore, it helps in planning for future electricity needs and thus in establishing new plants and networks.

The analysis of the electricity consumption of IIT(ISM) for the period 2004-08 gave a seasonal ARIMA(0,1,0)x(2,0,1,12) model as the best model, and it was able to forecast the consumption for the year 2008-09 with a MAPE of 6.63%.

References

1. Navani, J. P., N. K. Sharma, and Sonal Sapra. “Technical and non-technical losses in power system and its economic consequence in Indian economy.” International Journal of Electronics and Computer Science Engineering 1.2 (2012): 757-761.
2. Azadeh, Ali, and Z. S. Faiz. “A meta-heuristic framework for forecasting household electricity consumption.” Applied Soft Computing 11.1 (2011): 614-620.
3. Dzobo, O., et al. “Multi-dimensional customer segmentation model for power system reliability-worth analysis.” International Journal of Electrical Power & Energy Systems 62 (2014): 532-539.
4. Min, Brian, and Miriam Golden. “Electoral cycles in electricity losses in India.” Energy Policy 65 (2014): 619-625.
5. Rathod, Ravindra R., and Rahul Dev Garg. “Regional electricity consumption analysis for consumers using data mining techniques and consumer meter reading data.” International Journal of Electrical Power & Energy Systems 78 (2016): 368-374.
6. Chen, C. S., J. C. Hwang, and C. W. Huang. “Application of load survey systems to proper tariff design.” IEEE Transactions on Power Systems 12.4 (1997): 1746-1751.
7. Benítez, Ignacio, et al. “Dynamic clustering segmentation applied to load profiles of energy consumption from Spanish customers.” International Journal of Electrical Power & Energy Systems 55 (2014): 437-448.