A common question about R integration feature in Tableau 8.1 is how to use it with forecasting packages available in R. This is an interesting question for two reasons:

SCRIPT_ functions are calculated fields and calculated fields don’t add rows to the table while that’s what forecasting is all about.

Forecast packages normally return just the forecast results which is fewer rows than the input data and at the same time, doesn’t make sense to join back in.
Here is a relevant thread from Tableau forums.
http://community.tableausoftware.com/thread/137551
Before we move on any further, I would like to emphasize that Tableau has a built in forecast functionality which makes most of what we will cover here possible with the click of a button and comes with many more convenience features. The intent of the blog post is to show an alternative approach while very complicated can be useful for some.
The example in the forums uses the fpp package. For variety in this blog post I will be using the forecast package. It is worthwhile noting that, we worked very closely with Prof. Hyndman, the author of R’s forecast package when developing the forecasting feature in Tableau 8.
Let’s start with the visualizing the fit and residuals for your insample forecast. You will find this in the first sheet of the workbook I provided a link for at the end of the post.
We have two time series in this chart. To reduce the network traffic, I retrieved both columns in a single call to Rserve. Let’s take a look at the calculation.
First, we are loading the library. For this to work, please make sure the library and its dependencies are installed. You can find more about how to preload libraries for better performance in my posts here and here. Then we are creating a time series object (ts) using [Earnings] column (.arg1), since it is quarterly our deltat is 1/4 and our time series starts at the beginning of 1960. You will notice the paste function is taking two columns from the resulting forecast (fit and residuals) and concatenating them using ~ as the separator which allows us to retrieve both columns at once. Then in the calculated fields named Fit and Residuals, we tokenize the results. Only one of the calculated fields is shown below for brevity.
So far we’ve been dealing with the insample forecast, so we haven’t yet touched the core question about how to retrieve the future predictions into Tableau. Without further ado, let’s jump to the more exciting part. I will demonstrate two approaches. The first one is domain shifting which is shown in the second sheet.
Our calculation as shown above is very similar to the one we looked at earlier. But this time we are asking for a forecast (mean) of length (h=) set by the Tableau parameter named [PeriodsToForecast] and using append function to add the predicted rows to the end of the [Earnings] column (which answers the question 2 above) while dropping just as many rows from the top. The result is shown below. Since we forecast 10 quarters, our time series now extend 2.5 year beyond 1980 but our chart doesn’t start at 1960 anymore.
We achieve this also by shifting the date axis by the same amount as shown below.
But what if you don’t want to trim rows from the beginning of your data? That’s where densification comes to our aid.
The second approach relies on the fact that Tableau can fill in the missing values given a range. What we will do is to shift the last data point to the date we want our forecast to end. This will create a gap between the last value and the previous and Tableau’s “Show Missing Values” option will automatically add rows to fill that gap.
The last date is moved out to provide the full date extent for forecasting using the following calculated field. Here [MaxDate] is created at the data source level by joining the main table with a subquery that returns the MAX(Quarter) in the input data.
Our calculated field looks very similar to the previous calculation. Important changes are highlighted which involve removing the blank rows densification added from the data passed on to R’s forecast function as input and pulling back the shifted last data point to the correct date for the forecasting function. Then the results get merged back in using the same append function but this time without anything getting trimmed.
The result is shown below. For comparison we are forecasting 10 quarters (same as above). Now our time series extend 2.5 year beyond 1980 while chart still starts at 1960.
You can download the example workbook from HERE.
Probably the best articles that I have seen on R and tableau connectivity.
This has been incredibly helpful in trying to get a valuable forecast. I’m not a star with R, what would be your suggestion for getting this to work with hourly data, as opposed to quarterly. I couldn’t change the deltat to be a very small decimal.
myTimeSeries = ts(mydata, frequency = 24)
should do the trick.
Really helpful. I am struggling to find a way to pull out the upper and lower values for each of the forecasted data points. I have tried simply replacing the $mean with $upper but the function expected fewer values than the script returned.
Sir, i have one month data …i have to forecast for next month..what should be deltat value?? is it work ….
SCRIPT_REAL(“library(forecast);jjearnts < ts(.arg1,deltat=1440,start=c(2006, 1));fcast < forecast(jjearnts, h=.arg2[1]);
n<length(.arg1); append(.arg1[(.arg2[1]+1):n],fcast$mean, after = n.arg2[1])",SUM([earning]),[Parameter]) ………….(my starting data is from 01042006 end date is 31012015…..corresponding to this date i have daily earning data…………but in forcastion i m getting FLAT DATA 😦 ……….pls suggest me
Please try:
frequency = 365, start = c(2006,90)
April 1st is the 90th day of the year.
If you only have one month of input data, you shouldn’t try to forecast the next month based on that.You can find any trend or seasonality with a single data point. Do you mean monthly data?
Thanks a lot…awesome prediction..u r greatt…..v happy 🙂 my next step is to add dependent parameter…such as what will happen if company partner will increase ?what will be my prediction of earning …/actually i m not basically focus on earning forecast…i wanna see how to add dependency parameter?? i have searched but only find the dashboards..that looks fantastic..suggest me for the same….n thanks again Sir,,i did forecast without any mathmatical calculation…tableau is great with R
I had a query. I am getting the following error when i pull for the ‘Forecast – Domain Completion’ into the Rows section: Error in na.fail.default(as.ts(x)) : missing values in object
I can’t tell from the error. What else is in the visualization? In the example workbook, you won’t see such an error.
Thank you for this value adding article! My model seems quite reliable but when I want to forecast, my historical information also transforms. For a particular week I have a cancel ratio of let’s say 60%. When I want to forecast for let’s say 12 weeks, my historical data is transformed to lower or higher percentages, compared to the data when I am not forecasting.What am I doing wrong?
Thank you in advance.
Best
Bart
hi bora!
realy good article man.
i download the example and try to use the script on my data.
i use this script:
SCRIPT_REAL(“library(forecast);l<length(.arg1);u<.arg1[1:(l.arg2[1])];
n<length(u);u[n]=.arg1[l];jjearnts < ts(u, delta = 1/12,start=c(2014,1));
fcast < forecast(jjearnts, h=.arg2[1]);
append(u,fcast$mean, after = n)",SUM([costAfterDiscount]),[Period To Forecast])
but this script only change the values of my last periods (by "period to forecast") that i already have data for them.
i want to forecast periods that i dont have values for them but i think that the append doesnt work good for me.
thank you!
Hi Bora,
this is an awesome article. i am fairly new to R and connecting to R from tableau and found this article very helpful. i had a quick question regarding the forecast values, do you know if there is a way to get the predictions/forest for the development data as well.. in the above scenario, i am looking for the orange line (predicted data) for Q’s in 1960 through 1980 as well to check how the model performed??
thank you for your time and expertise…
Hi Bora,
I keep getting Unexpected number of results returned by SCRIPT function. Function expected 5206 values; 2268 values were returned.
I saw on tableau’s site the troubleshoot is: The R script result must be either a scalar or vector of length one that is replicated for all rows, or a vector of length equal to the number of rows in the Tableau result table.
But I can’t make much sense of it. Can you help?
Here is my code:
SCRIPT_REAL(”
fit < lm(.arg1 ~ .arg2 + .arg3)
fit$fitted
"
,
SUM([Applications (2015)]),
SUM([Applications (2014)]),
SUM([Applications (2013)])
)
I’m also gettting the same error whenever the arrgument i passed to tableau is a calculated field of first difference. how can we fix this? Would really apppreciate any insights from you, Bora..
Hi Bora,
I have started learning forecasting in R and Tableau. I was reading your post and I could not understand what actually fcast$mean is doing here?
Any help is really appreciated.
Thank you.
Excellent Article – trying to utilize method two to improve upon Tableau’s automatic forecasting capabilities. Could you explain in more detail how you created the parameter MaxDate? I’m not sure how to create a parameter at the Data source level from within Tableau desktop.
Thanks again!
Pingback: How To Forecast With Tableau And R  Data Science Riot!
Awesome post.
Is there a way to create [MaxDate] field with a calculated field instead of a sub query?
If you have Tableau 9.0 or later you can use an LOD expression. {MAX([Quarter])}
Hi, I have 30 days data from 04/18/2016 to 05/19/2016, I want to predict for next 2 or 3 days, when I am using the following script I am getting the error as follows.
Error – Error in stl(x, s.window = s.window, t.window = t.window, robust = robust) : series is not periodic or has less than two periods.
script – SCRIPT_REAL(“library(forecast);jjearnts < ts(.arg1, frequency = 365,start=c(2016,107));fcast < forecast(jjearnts, h=.arg2[1]);
n<length(.arg1); append(.arg1[(.arg2[1]+1):n],fcast$mean, after = n.arg2[1])",MAX([x]),[PeriodsToForecast]).
Your series is too short. Can you try using 7 as frequency instead of 365?
Hello Bora,
The following code works in R but it doesn’t in Tableau.
Error: An error occurred while communicating with the RServe service.
Error in seq.default(l + 1, l + n) : ‘to’ must be of length 1
l < as.numeric(length(.arg1));
n < .arg2;
LREVENUE < log(.arg1);
timeperiod < seq(l);
timeperiod < as.numeric(timeperiod);
fit1 = lm(LREVENUE~timeperiod);
Intercept < fit1$coefficients[[1]];
Slope < fit1$coefficients[[2]];
timeprd = seq(l+n);
timeprd < as.numeric(timeprd);
trend < Intercept + Slope*timeprd;
ts = data.frame(timeprd, trend);
trnd < fit1$fitted.values;
remaining < LREVENUE/trnd;
seasonal_factor < c();
for (i in seq(n)) {
v = mean(remaining[seq(i,l,n)])
seasonal_factor[i] < v[1]
}
sf < ts$trend[seq(l+1,l+n)]*seasonal_factor;
sf < exp(sf);
actualrevenue < exp(LREVENUE);
actualrev < actualrevenue[seq(n+1, l)];
rev < append(actualrev, sf);
rev
I was trying to fix the issue for the past week and I was not able to find the reason, could you please help me ?
Thanks in Advance!
Hi Bora,
Is there any way I can display summary table based on forecast ? Above example does quarterly forecast, I need yearly summary below the line chart in a dashboard view.
Thanks in advance !
Regards,
Gana
Of course you can. If you open the sample workbook in Tableau and select text table option in show me, this should just work.
I need the table to be aggregated one level above. The forecast numbers are at quarterly level, need the numbers in table at yearly level. The moment I change the column ‘Shifted for domain completion’ to ‘Year’ it throws error, which is expected due to the nature of R code. Wondering if there are any workarounds to get yearly summary based on quarterly projection…..
Can we plot R visualization in tableau…?
If you write it to a file, you can display it in a window inside a dashboard by pointing to its URL.
Hello Bora.., Thanks for quick response. sorry i couldn’t make the point of the statement can you please elaborate..?
My attempt, for example, in SCRIPT_* function, using hist(x) function of R to plot it on tableau, but it is not happening…
Hi Bora, I really like this blog!
It’s a great breakdown of how to integrate R with Tableau and create a usable forecast Measure in Tableau. This is a great improvement to the Forecasting that is in the Analytics section as you can actually use the Measure in other calculations.
I was wondering if you could help me get the append/densification part to work.
I have used the same script as in your example but in both the 1st and 2nd append scripts I receive the error ‘result returned by SCRIPT function is of unexpected type’.
I’ve tried a few different variations of the script, but it doesn’t seem to work.
I also think that in the code the ‘l’ look for length(.arg1) looks a lot like a 1, maybe you could rename it len< length(.arg1)
Thanks, Phil
Would you be able to share the exact SCRIPT_ you’re using here? It would be good to see what type of object your end result is.
Hello bora,
does .arg1 automatically gets the first column of your dataset? Thanks
The first argument you pass to the SCRIPT_ functions. Similarly .arg2 would be the second field you provide to the SCRIPT_ function.
Hello,how can i make my forecast display monthly values for the next 5 year?
I tried making the frequency value into 1/12 but still, it is displaying yearly values.. How can I fix that?
Hello Bora,
Need your help, please let me know where I am facing problem, trying very hard to find out this error.
“Error : Expected 875 values 874 values were returned” this similar type error always facing while returning values from R script not able to figure out any issue .Please help here is the attached code.
SCRIPT_REAL(”
library(forecast);
arg1<na.omit(.arg1);
time < ts(arg1,start=c(2013,1),frequency=1);
fcast<forecast(time,h=.arg2[1]);
fit < auto.arima(time)
f<forecast(fit, 2)
n<length(arg1);
#append(arg1[(.arg2[1]+1):n],fcast$mean,after=(.arg2[1]))
append(arg1[(.arg2[1]+1):n],f$mean,after=(.arg2[1]))
",
COUNTD([Bug ID]),[Periods to forecast])
Please follow the example in the workbook. Based on this script it is hard to tell what is going wrong since I can’t see the data but it looks like you took the example script and modified parts of it.
For example you forecast out 2 periods instead of .arg2[1] periods. If .arg2 does not equal 2 this would cause an error.
f<forecast(fit, 2)
Then you're supposed to append it after n.arg2[1] but instead you're appending at .arg2[1].
append(arg1[(.arg2[1]+1):n],f$mean,after=(.arg2[1]))
If you just follow the example this should work without any problems.
Hi Bora,
Thank you so much for your help:) It really worked out with your example. Now at least we are able to see forecast lines in Tableau. But we are facing the problem where forecasting is happening only till the year which is available in data, i.e it is showing only till 2017 ,even though we change period of forecast the forecasting is happening for previous years , Ex : forecast period=2 , its forecasting for 2016,2017…forecast period=3 , its forecasting for 2015,2016,2017 but not 2018,2019 not able to see 2018/2019 in plot of Tableau. It would of great help if you can suggest. Once again a loads of Thanks for help.
Thanks,
Vinay Haritsa
Please take a look at the example workbooks in this post. It already addresses this problem.
Hi Bora,
Thank you for your help and response. As you mentioned your example covers everything, by analysing calculative field shifteddate I analysed my issue can be resolved. Its great your guidance helped a lot, Now we are figuring out that our predictions is flat line ? its not incresing or decreasing . We are working on the same.Thank you once again.
Thanks,
Vinay Haritsa
Hello, thank you for your example, I’m having the same problem as Vinay the forecasting is happening only till the year which is available in my source data, I looked to your workbook and tried to understand and I don’t know if I really do. I added a shifted date from 2012 to 2020 but the forecasting looks like flat line and begin from 2014 however i have dates until 2018, besides the actual shows only one point. Does anyone have an idea ?
Thank you
I solved the problem by using the calculated field shifted date and replacing the third parameter of the function DATEADD by DATETRUNC(‘quarter’,[Date]) to trunc the quarter from the existing date on my source.
Hello Bora,
I used this script in the Tableau:
SCRIPT_REAL(“library(forecast);
time < ts(.arg1,start=c(2013,1), frequency=24);
fcast < forecast(time, h=.arg2[1]);
n<length(.arg1);
append(.arg1[(.arg2[1]+1):n],fcast$mean, after = n.arg2[1])",
SUM([Number of Records]),[Periods to Forecast])
but always have the problem with current month, e.g. now with May. For my purpose it should be forecasted until June. I have changed frequency to =24 as I don't have any forecast with frequency=12.
So, what should I have to change in the script to get appropriate forecast?
Or maybe, the better question would be, how to exclude current month (May) from Actuals and put it in Forecast?
Many thanks!
P.S. Don't know how to post the picture to see what's going on with the forecast.
Frequency 12 indicates a yearly (12 month cycle). If you enter 24 that means it is daily data (24 hour). If you want to forecast 24 months out you need to update the value of the parameter [Periods to Forecast]. Deltat is an alternative way of doing this which is what I used in my examples but yields the same result e.g. you can do deltat=1/4 instead of frequency=4 for quarterly data.
If you download the sample workbook, I think you will find the answer in the sheet that shows how to shift dates for domain completion. The code is slightly different
and you will need to use a date field in the opposing axis that shifts the last date you have N periods out (e.g. 24 months in your case). This way you will see the dates extending into the future while actuals remain untouched.
Actually frequency is this:
Data Frequency
Annual 1
Quarterly 4
Monthly 12
Weekly 52
I am trying to figure out these scripts and everything else regarding forecasting with R but I am still beginner and can’t catch everything at this moment. Still don’t understand some things.
I have tried to implement your whole example from https://boraberan.wordpress.com/2014/01/19/usingrforecastingpackagesfromtableau/ into my Tableau but, again, without success.
My date has format ” 1/8/2015 5:35:00 PM” and I suppose this could be one of the problem that I am facing right now.
I have used your script:
SCRIPT_REAL(“library(forecast);jjearnts < ts(.arg1,deltat=1/12,start=c(2013,1));fcast < forecast(jjearnts, h=.arg2[1]);
n<length(.arg1); append(.arg1[(.arg2[1]+1):n],fcast$mean, after = n.arg2[1])",SUM([Number of Records]),[Periods to Forecast])
and created calculated field for date: DATEADD('month', [Periods to Forecast], [Entered Date Time]) because I don't have quarter in my database.
also, created field for Pred_vs_Actuals: IF INDEX() < SIZE() – [Periods to Forecast]
THEN 'Actual'
ELSE 'Forecasted'
END
and when I put this in the Tableau I got FLAT line of forecast…
Btw, what does mean "jjearnts" in the script?
Many thanks.
Sample data is earnings for Johnson&Johnson, the data is converted to an R time series so I named it jjearnts. The name is not important. Date format is also not important since code doesn’t pass dates to R. It just tells R the date of the first point e.g. 1st month of 2013 and distance between points e.g. month (1/12) meaning 1st point will be Jan 2013, second point will be Feb 2013. R itself is generating a time series object based on this information as opposed to getting dates from Tableau. No error but flat line could mean that R didn’t find a good fitting model for your data.
Did you try Tableau’s forecasting and see how it does?
Yes, I tried Tableau’s forecast models (Automatic, NoneNone, NoneMultiplicative etc.) and they work very well. Maybe I don’t have enough long time series for this as I have data from 2013.
It’s not a problem to use Tableau’s forecast but I would like to create one, the best forecast model otherwise I have to select between Tableau models every time and it takes me a lot of time.
Thank you very much Bora!
Automatic will pick the best model for you. You really don’t need to select between models.
Yes, that’s true, but automatic model doesn’t give me any seasonality as I need it for my purpose.
Thank you very much Bora for help.
What do you expect the seasonal cycle to be in your data?
Tableau looks for common cycles in data where it recognizes the time domain e.g. it would look for 12 month cycles in monthly data, 24 hour cycles in hourly data etc.
If you convert the axis from date to integer it would search for any possible random seasonal cycle e.g. do a datediff to convert date to months since the first date etc. instead of using month of date.
You might want to give it a shot.
Hi Bora, thanks for all your work on this! I actually was doing some digging related to a Tableau Conference 2017 video on R and Forecasting, and stumbled upon your site!
Imagine my surprise when I discovered that you had written two extensive blog posts containing EXACTLY the content discussed in the second half of the talk:
http://tclive.tableau.com/Library/Video?vCode=17BI038
Thank you so much for the detailed writeup – I was able to figure out the issue I was having and was able to understand the process much better using your posts.
Thank you!
Best,
Zack
Great post, than you for sharing this knowledge. I have a question: is there any way to show only the last “actual” value label and the last “forecast” value label on the chart? So I can easily have an idea of current vs. forecast number.
Why does Tableau tell me the following when I try to open your workbook?
An error occurred while communicating with the RServe service.
Error : package or namespace load failed for ‘forecast’ in loadNamespace(j < i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]): there is no package called 'fracdiff'
Hi Bora…its a great post and I have a quick question. I am using a calculated field to pass it to the ARIMA for calculating the forecast. And for some reasons I am not getting the output properly. For all the forecast periods, it is just giving the one output. And when I am passing it as pre calculated field i.e. as part of data source, then it is working fine. Do you know how we can resolve such issue. Also another issue is that when getting the output, forecast for last data period is plotted on first ‘period to forecast’ point. Sorry if I am confusing you too much.