Week 12 R-lab
Zhaohu (Jonathan) Fan
07/27/2021
Our goal is to use the data set ‘books.csv’ to forecast the next four days’ sales of paperback books.
Step 1: Load packages used
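The code chunk for this step is not shown in this copy of the lab. Judging from the functions called later (read_csv(), filter(), ggplot(), ses(), accuracy()), a minimal set of packages would be the following; the exact list the lab loaded is an assumption.

```r
# Assumed package list, inferred from the functions used later in this lab
library(readr)     # read_csv()
library(dplyr)     # filter(), tibble()
library(ggplot2)   # ggplot()
library(forecast)  # ses(), accuracy(), subset() for ts objects
```

Loading library(tidyverse) in place of the first three lines would also work.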
Step 2: Import the data and convert Paperback to a time series object
setwd("C:/Users/fanzh/OneDrive - University of Cincinnati/UC_couse/000_Teaching_4090_SS21/Labs/Week12")
books <- read_csv("books.csv")
head(books, 5)
## # A tibble: 5 x 3
##     Day Paperback Hardcover
##   <dbl>     <dbl>     <dbl>
## 1     1       199       139
## 2     2       172       128
## 3     3       111       172
## 4     4       209       139
## 5     5       161       191
Note: Use of “[ ]” for subsetting and indexing
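The conversion code is not shown here. A minimal sketch, assuming the series is stored under the name paper.ts (the name Step 4 uses). The 30 Paperback values are typed in from the series printed in Step 4 so that the example runs without books.csv; the data frame is only a stand-in for the imported file.

```r
# Paperback sales for days 1-30, copied from the series printed in Step 4
paperback <- c(199, 172, 111, 209, 161, 119, 195, 195, 131, 183,
               143, 141, 168, 201, 155, 243, 225, 167, 237, 202,
               186, 176, 232, 195, 190, 182, 222, 217, 188, 247)

# Stand-in for read_csv("books.csv"); the Hardcover column is omitted
books <- data.frame(Day = 1:30, Paperback = paperback)

# "[ ]" subsets by row and column, here rows 1-3 of the Paperback column
books[1:3, "Paperback"]

# "[[ ]]" extracts one column as a plain numeric vector, which ts() converts
paper.ts <- ts(books[["Paperback"]])
paper.ts
```

With no start or frequency given, ts() numbers the observations 1 to 30 with frequency 1, which matches the "Start = 1" and "Frequency = 1" lines in the printed series.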
Step 3: Plot the paperback series
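The plotting code is not shown here. A base-R sketch, with the Paperback values typed in from the series printed in Step 4; the axis labels and title are illustrative choices, not the lab's.

```r
# Paperback sales for days 1-30, copied from the series printed in Step 4
paper.ts <- ts(c(199, 172, 111, 209, 161, 119, 195, 195, 131, 183,
                 143, 141, 168, 201, 155, 243, 225, 167, 237, 202,
                 186, 176, 232, 195, 190, 182, 222, 217, 188, 247))

# Line plot with a point at each day
plot(paper.ts, type = "o",
     xlab = "Day", ylab = "Paperback sales",
     main = "Daily paperback sales")
```

With the forecast and ggplot2 packages loaded, autoplot(paper.ts) draws the same series in ggplot style.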
Step 4: Partition the data so that the last four observations are used for your test data
library(forecast)
# partition: hold out the last four observations as the test set
paper_train <- subset(paper.ts, end = length(paper.ts) - 4)
paper_test <- subset(paper.ts, start = length(paper.ts) - 3)
paper_train
## Time Series:
## Start = 1
## End = 26
## Frequency = 1
##  [1] 199 172 111 209 161 119 195 195 131 183 143 141 168 201 155 243 225 167 237
## [20] 202 186 176 232 195 190 182
paper_test
## Time Series:
## Start = 27
## End = 30
## Frequency = 1
## [1] 222 217 188 247
Step 5: Use simple exponential smoothing (SES) and explore different values of \(\alpha\) (0.1, 0.2, 0.3, etc.). Record the test-set RMSE of each forecast. How does \(\alpha\) affect the forecasts?
Perform SES with alpha = 0.1, 0.2, 0.3
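The fitting code is not shown here. A sketch of what it could look like, assuming the forecast package and using the training and test values printed in Step 4; the object names ses.1, ses.2 and ses.3 are hypothetical.

```r
library(forecast)

# Training (days 1-26) and test (days 27-30) series as printed in Step 4
paper_train <- ts(c(199, 172, 111, 209, 161, 119, 195, 195, 131, 183,
                    143, 141, 168, 201, 155, 243, 225, 167, 237, 202,
                    186, 176, 232, 195, 190, 182))
paper_test <- ts(c(222, 217, 188, 247), start = 27)

# Fix alpha at three values; h = 4 forecasts days 27-30
ses.1 <- ses(paper_train, alpha = 0.1, h = 4)
ses.2 <- ses(paper_train, alpha = 0.2, h = 4)
ses.3 <- ses(paper_train, alpha = 0.3, h = 4)

# Training-set and test-set error measures for each fit
accuracy(ses.1, paper_test)
accuracy(ses.2, paper_test)
accuracy(ses.3, paper_test)
```

Each accuracy() call returns one "Training set" row and one "Test set" row; the RMSE column of the test row is the number to record.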
Step 6: Assess prediction accuracy:
##                     ME     RMSE      MAE        MPE     MAPE      MASE
## Training set  5.942452 34.13232 26.96938 -0.4295321 15.61521 0.6629641
## Test set     29.104802 35.86558 29.80240 12.4969025 12.86797 0.7326057
##                    ACF1 Theil's U
## Training set -0.1160565        NA
## Test set     -0.4716847 0.9675242
##                     ME     RMSE      MAE       MPE     MAPE      MASE
## Training set  4.490054 34.26272 28.43388 -1.082746 16.42398 0.6989645
## Test set     24.740097 32.42410 27.62005 10.480351 12.01224 0.6789589
##                    ACF1 Theil's U
## Training set -0.2169554        NA
## Test set     -0.4716847 0.8866256
##                     ME     RMSE      MAE       MPE     MAPE      MASE
## Training set  2.685266 35.06664 29.87898 -2.080051 17.34698 0.7344883
## Test set     25.458492 32.97552 27.97925 10.812259 12.15309 0.6877887
##                    ACF1 Theil's U
## Training set -0.2737697        NA
## Test set     -0.4716847 0.8995712
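To see why the error measures change with \(\alpha\), it helps to write the SES recursion out by hand. Each new level is a weighted average of the latest observation and the previous level, \(\ell_t = \alpha y_t + (1 - \alpha)\ell_{t-1}\), and every forecast equals the final level. The sketch below uses base R only and assumes the initial level is the first observation; ses() estimates the initial level instead, so its forecasts will differ somewhat. The function name ses_level is hypothetical.

```r
# Training values (days 1-26) as printed in Step 4
paper_train <- c(199, 172, 111, 209, 161, 119, 195, 195, 131, 183,
                 143, 141, 168, 201, 155, 243, 225, 167, 237, 202,
                 186, 176, 232, 195, 190, 182)

# Hand-rolled SES: update the level once per observation and return
# the final level, which is the flat forecast for every future day.
# Assumption: the initial level l0 defaults to the first observation.
ses_level <- function(y, alpha, l0 = y[1]) {
  level <- l0
  for (obs in y) {
    level <- alpha * obs + (1 - alpha) * level
  }
  level
}

# Forecasts under a small, a medium and a large alpha
sapply(c(0.1, 0.5, 0.9), function(a) ses_level(paper_train, a))
```

A small alpha averages over many past days, so the forecast moves slowly; a large alpha tracks the most recent days. At the extremes, alpha = 0 never leaves the initial level and alpha = 1 returns the last observation.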
Step 7: Let’s automate this process and assess alpha from 0.01 to 0.99:
# First, I create a vector of values from 0.01 to 0.99 with step 0.01
alpha <- seq(0.01, 0.99, 0.01)
# Initialize RMSE with a single NA value; the loop fills in one entry per alpha
RMSE <- NA
# Iterate through all values of alpha and record the test-set RMSE of each forecast
for (i in seq_along(alpha)) {
  fcast <- ses(paper_train, alpha = alpha[i], h = 4)
  RMSE[i] <- accuracy(fcast, paper_test)["Test set", "RMSE"]
}
# turn the RMSE values for alpha 0.01-0.99 into a data frame
error <- tibble(alpha, RMSE)
minimum <- filter(error, RMSE == min(RMSE))
# plot the RMSE values for alpha 0.01-0.99
ggplot(error, aes(alpha, RMSE)) +
  geom_line() +
  geom_point(data = minimum, color = "blue", size = 2) +
  ggtitle("alpha's impact on simple exponential smoothing forecast errors",
          subtitle = "alpha = 0.22 minimizes RMSE")
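The subtitle above hard-codes the minimizing alpha. A sketch of reading it off the grid instead, assuming the forecast package and the training and test values printed in Step 4; best_alpha and subtitle_text are hypothetical names, and sapply() is used here in place of the for loop.

```r
library(forecast)

# Training (days 1-26) and test (days 27-30) series as printed in Step 4
paper_train <- ts(c(199, 172, 111, 209, 161, 119, 195, 195, 131, 183,
                    143, 141, 168, 201, 155, 243, 225, 167, 237, 202,
                    186, 176, 232, 195, 190, 182))
paper_test <- ts(c(222, 217, 188, 247), start = 27)

alpha <- seq(0.01, 0.99, 0.01)

# Test-set RMSE for each alpha on the grid
RMSE <- sapply(alpha, function(a) {
  fcast <- ses(paper_train, alpha = a, h = 4)
  accuracy(fcast, paper_test)["Test set", "RMSE"]
})

# Grid value with the smallest test-set RMSE, and a subtitle built from it
best_alpha <- alpha[which.min(RMSE)]
subtitle_text <- paste("alpha =", format(best_alpha), "minimizes test RMSE")
subtitle_text
```

Passing subtitle_text to ggtitle() keeps the plot label in step with the data if the series or the grid changes.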
Step 8: Now let R select the optimal alpha value and use it to generate forecasts for the next four days. Then compare the automatically selected model with the alpha = 0.22 model from Step 7:
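The fitting code is not shown here, but the Call line in the output that follows reads ses(y = paper_train, h = 4), and the later accuracy() call uses the name ses.auto. A sketch consistent with both, assuming the forecast package; printing the fit with summary() is an assumption.

```r
library(forecast)

# Training values (days 1-26) as printed in Step 4
paper_train <- ts(c(199, 172, 111, 209, 161, 119, 195, 195, 131, 183,
                    143, 141, 168, 201, 155, 243, 225, 167, 237, 202,
                    186, 176, 232, 195, 190, 182))

# No alpha argument: ses() estimates alpha and the initial level
# from the training data
ses.auto <- ses(paper_train, h = 4)
summary(ses.auto)
```

Because SES has no trend or seasonal component, the four point forecasts are identical; only the prediction intervals widen with the horizon.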
##
## Forecast method: Simple exponential smoothing
##
## Model Information:
## Simple exponential smoothing
##
## Call:
## ses(y = paper_train, h = 4)
##
## Smoothing parameters:
## alpha = 0.1351
##
## Initial states:
## l = 171.713
##
## sigma: 35.4645
##
##      AIC     AICc      BIC
## 274.1931 275.2840 277.9674
##
## Error measures:
##                    ME    RMSE      MAE        MPE     MAPE      MASE       ACF1
## Training set 5.740754 34.0732 27.41353 -0.4571133 15.81877 0.6738824 -0.1616501
##
## Forecasts:
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 27       191.8781 146.4285 237.3277 122.3690 261.3873
## 28       191.8781 146.0156 237.7406 121.7375 262.0188
## 29       191.8781 145.6064 238.1498 121.1116 262.6446
## 30       191.8781 145.2008 238.5555 120.4913 263.2650
# optimal alpha for prediction accuracy (based on the for loop results above)
ses.optimal <- ses(paper_train, alpha = 0.22, h = 4)
# accuracy of the automatically selected model
accuracy(ses.auto, paper_test)
##                     ME     RMSE      MAE        MPE     MAPE      MASE
## Training set  5.740754 34.07320 27.41353 -0.4571133 15.81877 0.6738824
## Test set     26.621870 33.88176 28.56094 11.3497551 12.38117 0.7020879
##                    ACF1 Theil's U
## Training set -0.1616501        NA
## Test set     -0.4716847 0.9208597
# accuracy of the alpha = 0.22 model
accuracy(ses.optimal, paper_test)
##                     ME     RMSE      MAE       MPE     MAPE      MASE
## Training set  4.085371 34.38411 28.72038 -1.299844 16.60708 0.7060073
## Test set     24.654625 32.35893 27.57731 10.440862 11.99548 0.6779084
##                    ACF1 Theil's U
## Training set -0.2300195        NA
## Test set     -0.4716847  0.885096
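The test-set row of accuracy() can be checked by hand. The sketch below uses base R only: it takes the point forecast 191.8781 from the Step 8 output for the automatically selected model and compares it with the four held-out values. The variable names are illustrative.

```r
# Held-out sales for days 27-30, as printed in Step 4
paper_test <- c(222, 217, 188, 247)

# Flat SES point forecast from the Step 8 output (rounded to four decimals)
point_forecast <- rep(191.8781, 4)

# Forecast errors: actual minus forecast
errors <- paper_test - point_forecast

me   <- mean(errors)                          # mean error
rmse <- sqrt(mean(errors^2))                  # root mean squared error
mae  <- mean(abs(errors))                     # mean absolute error
mape <- mean(abs(errors / paper_test)) * 100  # mean absolute percentage error

round(c(ME = me, RMSE = rmse, MAE = mae, MAPE = mape), 2)
```

These agree with the Test set row printed for ses.auto (ME 26.62, RMSE 33.88, MAE 28.56, MAPE 12.38), up to rounding of the point forecast. The positive ME reflects that the forecast sits below three of the four held-out values.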