A common question about R integration feature in Tableau 8.1 is how to use it with forecasting packages available in R. This is an interesting question for two reasons:

SCRIPT_ functions are calculated fields and calculated fields don’t add rows to the table while that’s what forecasting is all about.

Forecast packages normally return just the forecast results which is fewer rows than the input data and at the same time, doesn’t make sense to join back in.
Here is a relevant thread from Tableau forums.
http://community.tableausoftware.com/thread/137551
Before we move on any further, I would like to emphasize that Tableau has a built in forecast functionality which makes most of what we will cover here possible with the click of a button and comes with many more convenience features. The intent of the blog post is to show an alternative approach while very complicated can be useful for some.
The example in the forums uses the fpp package. For variety in this blog post I will be using the forecast package. It is worthwhile noting that, we worked very closely with Prof. Hyndman, the author of R’s forecast package when developing the forecasting feature in Tableau 8.
Let’s start with the visualizing the fit and residuals for your insample forecast. You will find this in the first sheet of the workbook I provided a link for at the end of the post.
We have two time series in this chart. To reduce the network traffic, I retrieved both columns in a single call to Rserve. Let’s take a look at the calculation.
First, we are loading the library. For this to work, please make sure the library and its dependencies are installed. You can find more about how to preload libraries for better performance in my posts here and here. Then we are creating a time series object (ts) using [Earnings] column (.arg1), since it is quarterly our deltat is 1/4 and our time series starts at the beginning of 1960. You will notice the paste function is taking two columns from the resulting forecast (fit and residuals) and concatenating them using ~ as the separator which allows us to retrieve both columns at once. Then in the calculated fields named Fit and Residuals, we tokenize the results. Only one of the calculated fields is shown below for brevity.
So far we’ve been dealing with the insample forecast, so we haven’t yet touched the core question about how to retrieve the future predictions into Tableau. Without further ado, let’s jump to the more exciting part. I will demonstrate two approaches. The first one is domain shifting which is shown in the second sheet.
Our calculation as shown above is very similar to the one we looked at earlier. But this time we are asking for a forecast (mean) of length (h=) set by the Tableau parameter named [PeriodsToForecast] and using append function to add the predicted rows to the end of the [Earnings] column (which answers the question 2 above) while dropping just as many rows from the top. The result is shown below. Since we forecast 10 quarters, our time series now extend 2.5 year beyond 1980 but our chart doesn’t start at 1960 anymore.
We achieve this also by shifting the date axis by the same amount as shown below.
But what if you don’t want to trim rows from the beginning of your data? That’s where densification comes to our aid.
The second approach relies on the fact that Tableau can fill in the missing values given a range. What we will do is to shift the last data point to the date we want our forecast to end. This will create a gap between the last value and the previous and Tableau’s “Show Missing Values” option will automatically add rows to fill that gap.
The last date is moved out to provide the full date extent for forecasting using the following calculated field. Here [MaxDate] is created at the data source level by joining the main table with a subquery that returns the MAX(Quarter) in the input data.
Our calculated field looks very similar to the previous calculation. Important changes are highlighted which involve removing the blank rows densification added from the data passed on to R’s forecast function as input and pulling back the shifted last data point to the correct date for the forecasting function. Then the results get merged back in using the same append function but this time without anything getting trimmed.
The result is shown below. For comparison we are forecasting 10 quarters (same as above). Now our time series extend 2.5 year beyond 1980 while chart still starts at 1960.
You can download the example workbook from HERE.
Probably the best articles that I have seen on R and tableau connectivity.
This has been incredibly helpful in trying to get a valuable forecast. I’m not a star with R, what would be your suggestion for getting this to work with hourly data, as opposed to quarterly. I couldn’t change the deltat to be a very small decimal.
myTimeSeries = ts(mydata, frequency = 24)
should do the trick.
Really helpful. I am struggling to find a way to pull out the upper and lower values for each of the forecasted data points. I have tried simply replacing the $mean with $upper but the function expected fewer values than the script returned.
Sir, i have one month data …i have to forecast for next month..what should be deltat value?? is it work ….
SCRIPT_REAL(“library(forecast);jjearnts < ts(.arg1,deltat=1440,start=c(2006, 1));fcast < forecast(jjearnts, h=.arg2[1]);
n<length(.arg1); append(.arg1[(.arg2[1]+1):n],fcast$mean, after = n.arg2[1])",SUM([earning]),[Parameter]) ………….(my starting data is from 01042006 end date is 31012015…..corresponding to this date i have daily earning data…………but in forcastion i m getting FLAT DATA 😦 ……….pls suggest me
Please try:
frequency = 365, start = c(2006,90)
April 1st is the 90th day of the year.
If you only have one month of input data, you shouldn’t try to forecast the next month based on that.You can find any trend or seasonality with a single data point. Do you mean monthly data?
Thanks a lot…awesome prediction..u r greatt…..v happy🙂 my next step is to add dependent parameter…such as what will happen if company partner will increase ?what will be my prediction of earning …/actually i m not basically focus on earning forecast…i wanna see how to add dependency parameter?? i have searched but only find the dashboards..that looks fantastic..suggest me for the same….n thanks again Sir,,i did forecast without any mathmatical calculation…tableau is great with R
I had a query. I am getting the following error when i pull for the ‘Forecast – Domain Completion’ into the Rows section: Error in na.fail.default(as.ts(x)) : missing values in object
I can’t tell from the error. What else is in the visualization? In the example workbook, you won’t see such an error.
Thank you for this value adding article! My model seems quite reliable but when I want to forecast, my historical information also transforms. For a particular week I have a cancel ratio of let’s say 60%. When I want to forecast for let’s say 12 weeks, my historical data is transformed to lower or higher percentages, compared to the data when I am not forecasting.What am I doing wrong?
Thank you in advance.
Best
Bart
Hi Bora,
this is an awesome article. i am fairly new to R and connecting to R from tableau and found this article very helpful. i had a quick question regarding the forecast values, do you know if there is a way to get the predictions/forest for the development data as well.. in the above scenario, i am looking for the orange line (predicted data) for Q’s in 1960 through 1980 as well to check how the model performed??
thank you for your time and expertise…
Hi Bora,
I keep getting Unexpected number of results returned by SCRIPT function. Function expected 5206 values; 2268 values were returned.
I saw on tableau’s site the troubleshoot is: The R script result must be either a scalar or vector of length one that is replicated for all rows, or a vector of length equal to the number of rows in the Tableau result table.
But I can’t make much sense of it. Can you help?
Here is my code:
SCRIPT_REAL(”
fit < lm(.arg1 ~ .arg2 + .arg3)
fit$fitted
"
,
SUM([Applications (2015)]),
SUM([Applications (2014)]),
SUM([Applications (2013)])
)
Hi Bora,
I have started learning forecasting in R and Tableau. I was reading your post and I could not understand what actually fcast$mean is doing here?
Any help is really appreciated.
Thank you.
Excellent Article – trying to utilize method two to improve upon Tableau’s automatic forecasting capabilities. Could you explain in more detail how you created the parameter MaxDate? I’m not sure how to create a parameter at the Data source level from within Tableau desktop.
Thanks again!
[…] Using Rob Hyndman’s forecast package in R, we can easily write the results of the fitted model at the row level, which makes visualizing the accuracy of the model much easier. Below is some simple R code using Tableau’s builtin “Superstore” data set. Along with Hyndman’s functions, I’m also borrowing Tableau methods from Bora Beran. […]
Awesome post.
Is there a way to create [MaxDate] field with a calculated field instead of a sub query?
If you have Tableau 9.0 or later you can use an LOD expression. {MAX([Quarter])}
Hi, I have 30 days data from 04/18/2016 to 05/19/2016, I want to predict for next 2 or 3 days, when I am using the following script I am getting the error as follows.
Error – Error in stl(x, s.window = s.window, t.window = t.window, robust = robust) : series is not periodic or has less than two periods.
script – SCRIPT_REAL(“library(forecast);jjearnts < ts(.arg1, frequency = 365,start=c(2016,107));fcast < forecast(jjearnts, h=.arg2[1]);
n<length(.arg1); append(.arg1[(.arg2[1]+1):n],fcast$mean, after = n.arg2[1])",MAX([x]),[PeriodsToForecast]).
Your series is too short. Can you try using 7 as frequency instead of 365?
Hello Bora,
The following code works in R but it doesn’t in Tableau.
Error: An error occurred while communicating with the RServe service.
Error in seq.default(l + 1, l + n) : ‘to’ must be of length 1
l < as.numeric(length(.arg1));
n < .arg2;
LREVENUE < log(.arg1);
timeperiod < seq(l);
timeperiod < as.numeric(timeperiod);
fit1 = lm(LREVENUE~timeperiod);
Intercept < fit1$coefficients[[1]];
Slope < fit1$coefficients[[2]];
timeprd = seq(l+n);
timeprd < as.numeric(timeprd);
trend < Intercept + Slope*timeprd;
ts = data.frame(timeprd, trend);
trnd < fit1$fitted.values;
remaining < LREVENUE/trnd;
seasonal_factor < c();
for (i in seq(n)) {
v = mean(remaining[seq(i,l,n)])
seasonal_factor[i] < v[1]
}
sf < ts$trend[seq(l+1,l+n)]*seasonal_factor;
sf < exp(sf);
actualrevenue < exp(LREVENUE);
actualrev < actualrevenue[seq(n+1, l)];
rev < append(actualrev, sf);
rev
I was trying to fix the issue for the past week and I was not able to find the reason, could you please help me ?
Thanks in Advance!
Hi Bora,
Is there any way I can display summary table based on forecast ? Above example does quarterly forecast, I need yearly summary below the line chart in a dashboard view.
Thanks in advance !
Regards,
Gana
Of course you can. If you open the sample workbook in Tableau and select text table option in show me, this should just work.
I need the table to be aggregated one level above. The forecast numbers are at quarterly level, need the numbers in table at yearly level. The moment I change the column ‘Shifted for domain completion’ to ‘Year’ it throws error, which is expected due to the nature of R code. Wondering if there are any workarounds to get yearly summary based on quarterly projection…..