Simple Exponential Smoothing in R

Week 12 R-lab
Zhaohu (Jonathan) Fan

07/27/2021

Forecasting Method(s)

How It Differs

How It Differs (cont’d)

How It Differs (cont’d)

The Equation

Choosing the Smoothing Constant

Differencing Summary

Data Example (Let’s practice!)

Our goal is to use the data set ‘books.csv’ to forecast the next four days’ sales of paperback books.

Step 1: Load packages used

library(tidyverse)
library(fpp2)

Step 2: Import and convert paperback to time series object

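# Set the working directory to the folder that contains books.csv (adjust this path for your machine)
setwd("C:/Users/fanzh/OneDrive - University of Cincinnati/UC_couse/000_Teaching_4090_SS21/Labs/Week12")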
books <- read_csv("books.csv")
head(books,5) 
## # A tibble: 5 x 3
##     Day Paperback Hardcover
##   <dbl>     <dbl>     <dbl>
## 1     1       199       139
## 2     2       172       128
## 3     3       111       172
## 4     4       209       139
## 5     5       161       191
paper.ts <- ts(books["Paperback"], start = 1, frequency = 1)

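Note: books["Paperback"] uses single brackets “[ ]” to subset the tibble by column name. The result is still a one-column tibble, not a plain vector.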
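The difference between single and double brackets can be seen in a small sketch. It uses a plain data.frame built from the five rows printed by head() as a stand-in for the tibble (an assumption made so the example runs without books.csv); the names books_demo, one_col, as_vec and demo_ts are illustrative only.

```r
# First five rows of books.csv, copied from the head() output above
books_demo <- data.frame(
  Day       = 1:5,
  Paperback = c(199, 172, 111, 209, 161),
  Hardcover = c(139, 128, 172, 139, 191)
)

one_col <- books_demo["Paperback"]    # single brackets: still a data frame
as_vec  <- books_demo[["Paperback"]]  # double brackets: a plain numeric vector

class(one_col)
class(as_vec)

# Passing the vector form to ts() gives a plain univariate series
demo_ts <- ts(as_vec, start = 1, frequency = 1)
demo_ts
```

Either form can be passed to ts(); the vector form simply avoids carrying the column structure into the time series object.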

Data Example (cont’d)

Step 3: Plot the paperback series

autoplot(paper.ts) + geom_smooth(method = "lm", linetype = "dashed", se = FALSE, size = .5)

Data Example (cont’d)

Step 4: Partition the data so that the last four observations are used for your test data

library(forecast)
# partition
paper_train <- subset(paper.ts,end = length(paper.ts)- 4)
paper_test <- subset(paper.ts,start = length(paper.ts) - 3)

paper_train 
## Time Series:
## Start = 1 
## End = 26 
## Frequency = 1 
##  [1] 199 172 111 209 161 119 195 195 131 183 143 141 168 201 155 243 225 167 237
## [20] 202 186 176 232 195 190 182
paper_test
## Time Series:
## Start = 27 
## End = 30 
## Frequency = 1 
## [1] 222 217 188 247
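The same split can be sketched in base R with window(), which may help when the forecast package is not loaded. The 30 values are copied from the printed training and test sets; the names paper_demo, train_demo and test_demo are illustrative, and this is an alternative to subset(), not what the lab itself runs.

```r
# The 30 daily paperback sales, copied from the printed train and test sets above
paperback <- c(199, 172, 111, 209, 161, 119, 195, 195, 131, 183, 143, 141, 168,
               201, 155, 243, 225, 167, 237, 202, 186, 176, 232, 195, 190, 182,
               222, 217, 188, 247)
paper_demo <- ts(paperback, start = 1, frequency = 1)

n <- length(paper_demo)
h <- 4  # size of the hold-out test set

# window() selects by time value; with start = 1 and frequency = 1
# the time values coincide with the observation index
train_demo <- window(paper_demo, end = n - h)
test_demo  <- window(paper_demo, start = n - h + 1)

length(train_demo)
length(test_demo)
test_demo
```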

Data Example (cont’d)

Step 5: Use simple exponential smoothing and explore different values of \(\alpha\) (0.1, 0.2, 0.3, etc.). Record the test-set RMSE of each forecast. How does \(\alpha\) affect the forecasts?

Perform SES with alpha = .1, .2, .3

ses.1 <- ses(paper_train, alpha = 0.1, h = 4)
ses.2 <- ses(paper_train, alpha = 0.2, h = 4)
ses.3 <- ses(paper_train, alpha = 0.3, h = 4)
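One way to see how \(\alpha\) affects the forecasts: an SES forecast is a weighted average of past observations in which the observation \(j\) steps before the most recent one receives weight \(\alpha(1-\alpha)^j\). The sketch below tabulates those weights for the three values used above. The helper ses_weights is hypothetical and is not part of the forecast package.

```r
# Hypothetical helper: weight given to the observation j steps back,
# for j = 0 (most recent) up to n_lags - 1
ses_weights <- function(alpha, n_lags = 6) {
  j <- 0:(n_lags - 1)
  alpha * (1 - alpha)^j
}

alphas  <- c(0.1, 0.2, 0.3)
weights <- sapply(alphas, ses_weights)  # one column per alpha
dimnames(weights) <- list(paste0("lag_", 0:5), paste0("alpha_", alphas))

round(weights, 3)
```

A larger \(\alpha\) puts more weight on the most recent observation and lets the weights decay faster, so the forecast reacts more strongly to recent changes. A smaller \(\alpha\) spreads the weight over a longer history and gives a smoother forecast.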

Data Example (cont’d)

Step 6: Assess prediction accuracy:

accuracy(ses.1, paper_test)
##                     ME     RMSE      MAE        MPE     MAPE      MASE
## Training set  5.942452 34.13232 26.96938 -0.4295321 15.61521 0.6629641
## Test set     29.104802 35.86558 29.80240 12.4969025 12.86797 0.7326057
##                    ACF1 Theil's U
## Training set -0.1160565        NA
## Test set     -0.4716847 0.9675242
accuracy(ses.2, paper_test)
##                     ME     RMSE      MAE       MPE     MAPE      MASE
## Training set  4.490054 34.26272 28.43388 -1.082746 16.42398 0.6989645
## Test set     24.740097 32.42410 27.62005 10.480351 12.01224 0.6789589
##                    ACF1 Theil's U
## Training set -0.2169554        NA
## Test set     -0.4716847 0.8866256
accuracy(ses.3, paper_test)
##                     ME     RMSE      MAE       MPE     MAPE      MASE
## Training set  2.685266 35.06664 29.87898 -2.080051 17.34698 0.7344883
## Test set     25.458492 32.97552 27.97925 10.812259 12.15309 0.6877887
##                    ACF1 Theil's U
## Training set -0.2737697        NA
## Test set     -0.4716847 0.8995712

Data Example (cont’d)

Step 7: Let’s automate this process and assess every alpha from 0.01 to 0.99:

# First, I create a vector of values from 0.01 to 0.99 with step 0.01
alpha <- seq(0.01, 0.99, 0.01) 

Initialize RMSE as a numeric vector with one NA slot per value of alpha.

RMSE <- rep(NA_real_, length(alpha))

Iterate through all values of vector alpha and assess prediction accuracy

for(i in seq_along(alpha)) {
  fcast <- ses(paper_train, alpha = alpha[i], h = 4)
  RMSE[i] <- accuracy(fcast, paper_test)["Test set", "RMSE"]  # row 2, column 2 of the accuracy table
}

# turn our RMSE values for alpha .01-.99 into a data frame
error <- tibble(alpha, RMSE)  # tibble() replaces the deprecated data_frame()
minimum <- filter(error, RMSE == min(RMSE))

# plot RMSE values for alpha .01-.99
ggplot(error, aes(alpha, RMSE)) +
  geom_line() +
  geom_point(data = minimum, color = "blue", size = 2) +
  ggtitle("alpha's impact on simple exponential smoothing forecast errors",
          subtitle = "alpha = 0.22 minimizes RMSE")
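The subtitle above is typed by hand, so it would go stale if the data changed. A minimal sketch of deriving it from the results instead, using only the three test-set RMSE values reported in Step 6 rather than the full grid; the names error_demo, best and subtitle_text are illustrative.

```r
# Test-set RMSE for three alphas, copied from the accuracy() output in Step 6
error_demo <- data.frame(
  alpha = c(0.1, 0.2, 0.3),
  RMSE  = c(35.86558, 32.42410, 32.97552)
)

# Row with the smallest RMSE
best <- error_demo[which.min(error_demo$RMSE), ]

# Build the subtitle from the result instead of typing the value by hand
subtitle_text <- sprintf("alpha = %.2f minimizes RMSE", best$alpha)
subtitle_text
```

With only these three candidates the minimum falls at alpha = 0.2. The same pattern applied to the full error table would pick up the grid-search minimum automatically.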

Data Example (cont’d)

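Step 8: Now let R estimate alpha itself by omitting the alpha argument, and use the fitted model to forecast the next four days. Then compare this automatically selected model with the alpha = 0.22 model found by the grid search. Note that ses() estimates alpha from the training data only, whereas 0.22 was chosen to minimize the test-set RMSE: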

ses.auto <- ses(paper_train, h = 4)
summary(ses.auto)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = paper_train, h = 4) 
## 
##   Smoothing parameters:
##     alpha = 0.1351 
## 
##   Initial states:
##     l = 171.713 
## 
##   sigma:  35.4645
## 
##      AIC     AICc      BIC 
## 274.1931 275.2840 277.9674 
## 
## Error measures:
##                    ME    RMSE      MAE        MPE     MAPE      MASE       ACF1
## Training set 5.740754 34.0732 27.41353 -0.4571133 15.81877 0.6738824 -0.1616501
## 
## Forecasts:
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 27       191.8781 146.4285 237.3277 122.3690 261.3873
## 28       191.8781 146.0156 237.7406 121.7375 262.0188
## 29       191.8781 145.6064 238.1498 121.1116 262.6446
## 30       191.8781 145.2008 238.5555 120.4913 263.2650
# optimal alpha for prediction accuracy (based on above for loop results)
ses.optimal <- ses(paper_train, alpha = .22, h = 4)

accuracy(ses.auto, paper_test)
##                     ME     RMSE      MAE        MPE     MAPE      MASE
## Training set  5.740754 34.07320 27.41353 -0.4571133 15.81877 0.6738824
## Test set     26.621870 33.88176 28.56094 11.3497551 12.38117 0.7020879
##                    ACF1 Theil's U
## Training set -0.1616501        NA
## Test set     -0.4716847 0.9208597
accuracy(ses.optimal, paper_test)
##                     ME     RMSE      MAE       MPE     MAPE      MASE
## Training set  4.085371 34.38411 28.72038 -1.299844 16.60708 0.7060073
## Test set     24.654625 32.35893 27.57731 10.440862 11.99548 0.6779084
##                    ACF1 Theil's U
## Training set -0.2300195        NA
## Test set     -0.4716847  0.885096
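To make the ses.auto output less of a black box, the sketch below reruns the SES recursion by hand in base R, starting from the alpha and initial level printed by summary(ses.auto), and then recomputes the test-set RMSE. The data are copied from the printed series in Step 4. Because the parameters are used as rounded in the printout, the results should agree with the output above only approximately.

```r
# Training and test data, copied from the printed series in Step 4
train <- c(199, 172, 111, 209, 161, 119, 195, 195, 131, 183, 143, 141, 168,
           201, 155, 243, 225, 167, 237, 202, 186, 176, 232, 195, 190, 182)
test  <- c(222, 217, 188, 247)

# Parameters reported by summary(ses.auto), rounded as printed
alpha <- 0.1351
level <- 171.713  # initial state l

# SES recursion: move the level toward each observation by a fraction alpha
for (y in train) {
  level <- level + alpha * (y - level)
}

point_forecast <- level            # SES gives the same forecast at every horizon
errors <- test - point_forecast
rmse   <- sqrt(mean(errors^2))

round(point_forecast, 2)  # approximately 191.88
round(rmse, 2)            # approximately 33.88
```

The final level matches the flat point forecast of 191.8781, and the hand-computed RMSE matches the test-set RMSE of 33.88176 reported by accuracy(ses.auto, paper_test).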

Data Example (cont’d)

Step 9: Plot the forecasted values with the alpha that minimizes the prediction error:

autoplot(ses.optimal) +
  autolayer(ses.optimal$fitted)