Recent versions of Prophet, the underlying engine of Exploratory’s Time Series Forecasting Analytics View, include national holiday data for various countries. Just by specifying a country’s code, you can make a forecast that takes the effect of that country’s holidays into account.
In this Note, I will show you how to make such a forecast with an example.
Here is the data for Wikipedia page views of the page Barbecue Source. I expect it to be affected by national holidays, since holidays are good occasions for barbecue parties.
On a line chart, it looks like this.
After loading the data, go to the Analytics View and select Time Series Forecasting as the Type. Then select the input columns for the Analytics as follows.
Let’s make a forecast for 90 days into the future. Open the Analytics Properties dialog by clicking the gear icon, as shown in the screenshot, and set Forecasting Time Period to 90.
Now, let’s set “US” as Countries for Holidays. You can also specify multiple countries by separating the country codes with commas. The list of supported countries and their codes can be seen by clicking the question mark icon next to the title of the input field, or at this link. Click the green Apply button to run the forecast.
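Outside Exploratory, roughly the same setup can be sketched directly against Prophet's Python API. This is a minimal sketch, not Exploratory's actual implementation: the helper names `parse_country_codes` and `forecast_with_holidays` are hypothetical, the input is assumed to be a pandas DataFrame with Prophet's usual `ds` (date) and `y` (page views) columns, and only a single country is handled.

```python
def parse_country_codes(text):
    """Split a comma-separated field such as "US, JP" into a list of codes."""
    return [code.strip() for code in text.split(",") if code.strip()]


def forecast_with_holidays(df, country_code, horizon_days=90):
    """Fit Prophet with one country's built-in holidays and forecast ahead.

    df is assumed to be a pandas DataFrame with columns ds (date) and y (value).
    """
    from prophet import Prophet  # lazy import; needs the prophet package installed

    model = Prophet()
    # Built-in holidays must be registered before fitting the model.
    model.add_country_holidays(country_name=country_code)
    model.fit(df)
    # Extend the history by horizon_days daily rows, then predict over all rows.
    future = model.make_future_dataframe(periods=horizon_days)
    return model, model.predict(future)
```

With the page-view data loaded as `views_df` (a hypothetical name), calling `forecast_with_holidays(views_df, parse_country_codes("US")[0])` would return the fitted model together with the forecast table.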
The result of the forecast is displayed on the line chart under the Forecasted tab. The part with only the orange line is our 90-day forecast of daily views of the Barbecue Source Wikipedia page.
You can drag the mouse cursor on the chart below to zoom into the forecasted part.
Click the Effects tab to see the effects that make up the forecasted values. The purple line is the effect of Holidays. We can see that the model considers the 4th of July to have a positive effect on the views of the Barbecue Source page.
You can drag the mouse on the chart below to zoom into it.
Under the Data tab, the effects of each individual holiday are available. This data can be exported using the downward arrow icon.
Let’s test the forecast by comparing the result with actual data using Test Mode.
For comparison, let’s first remove “US” from Countries for Holidays before making a forecast with Test Mode.
Enable Test Mode and set 90 as the number of days to use for test data. Click the green Apply button.
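Conceptually, a test of this kind holds out the most recent days, fits the model on the rest, and compares the forecast with the held-out values. Below is a minimal sketch of that split in plain Python. The function name `split_holdout` and the toy data are hypothetical, and how Exploratory implements Test Mode internally is not shown here.

```python
from datetime import date, timedelta


def split_holdout(rows, test_days=90):
    """Split date-sorted (date, value) rows into training and test parts.

    The last test_days rows become the test period; earlier rows are training.
    """
    if not 0 < test_days < len(rows):
        raise ValueError("test_days must be between 1 and len(rows) - 1")
    return rows[:-test_days], rows[-test_days:]


# Toy daily series: 365 days of made-up values, one row per day.
start = date(2018, 1, 1)
rows = [(start + timedelta(days=i), 100 + i % 7) for i in range(365)]

train, test = split_holdout(rows, test_days=90)
print(len(train), len(test))  # 275 90
```

The model never sees the held-out rows during fitting, so the error measured on them is a fair estimate of how the forecast would have performed.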
The part with the light blue line is the test period.
Zooming in, we can see that there is a peak on the 4th of July in the actual data, but our forecast was not able to predict it.
On the Summary tab, metrics for the quality of the forecast are reported.
RMSE (root mean square error), which can be interpreted as the typical size of the difference between the forecast and the actual result, is 56.83. We will compare this with the forecast with the holiday effect later.
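For reference, RMSE is the square root of the mean of the squared differences between forecast and actual values. A minimal sketch in plain Python, using made-up numbers rather than the page-view data (the function name `rmse` is only illustrative):

```python
import math


def rmse(actual, forecast):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up example: errors of 3, -4, 0, 5 give sqrt((9 + 16 + 0 + 25) / 4).
print(rmse([103, 96, 100, 105], [100, 100, 100, 100]))  # about 3.54
```

Because the errors are squared before averaging, a large miss such as an unpredicted holiday peak weighs more heavily than several small ones.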
Now, let’s re-enable the holiday effect in the model by setting Countries for Holidays to “US” again.
Again, the part with the light blue line is the test period.
Again, here is the chart, which you can zoom into by dragging the mouse on it.
We can see that the peak on the 4th of July is correctly forecasted.
On the Summary tab, metrics for the quality of the forecast are reported.
RMSE is now 55.14. It improved from 56.83, which we had without specifying Countries for Holidays, a reduction of about 3%.
It seems that the holiday effect helps the model make a better forecast in this case.