Time Series Clustering in Tableau using R

Clustering is a very common data mining task with a wide variety of applications, from customer segmentation to grouping text documents. K-means clustering was one of the examples I used in my blog post introducing R integration back in Tableau 8.1. Many others in the Tableau community have written similar articles explaining how different clustering techniques can be used in Tableau via R integration.

One thing I didn’t see getting much attention was time series clustering and the use of hierarchical clustering algorithms. So I thought it might be good to cover both in a single post.

Let’s say you have multiple time series curves (stock prices, social media activity, temperature readings…) and want to group similar curves together. The series don’t necessarily align perfectly; in fact, they could even describe events that unfold at different paces.

Time series clustering in Tableau using R

There are of course many algorithms to achieve this task but R conveniently offers a package for Dynamic Time Warping. Below is what my calculated field in Tableau looks like.

Calculation for dynamic time warping in Tableau

I started by loading the dtw package, then converted my data from long table format (all time series are in the same column) to wide table format (each series is a separate column).

I then computed a distance matrix using dtw as the method and applied hierarchical clustering with average linkage. This gives a tree, which I then prune to get the desired number of clusters.

Finally, the data is converted back into a long table before being pulled back into Tableau.
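To make the steps above concrete, here is a minimal sketch in plain Python rather than R, so it can be run without Tableau or Rserve. It is not the calculated field from the workbook: the function names and the sample data are made up for illustration, and the DTW recurrence is the textbook one, which may weight steps differently from the dtw package's defaults.

```python
from itertools import combinations

def long_to_wide(rows):
    """Turn (series_id, time, value) rows into {series_id: [values in time order]}."""
    wide = {}
    for series_id, _, value in sorted(rows, key=lambda r: (r[0], r[1])):
        wide.setdefault(series_id, []).append(value)
    return wide

def dtw_distance(a, b):
    """Textbook dynamic time warping distance with |a - b| as the local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def average_linkage_clusters(wide, k):
    """Agglomerative clustering with average linkage, stopped at k clusters."""
    names = list(wide)
    dist = {frozenset(p): dtw_distance(wide[p[0]], wide[p[1]])
            for p in combinations(names, 2)}
    clusters = [[name] for name in names]
    while len(clusters) > k:
        best = None
        for x, y in combinations(range(len(clusters)), 2):
            pairs = [(p, q) for p in clusters[x] for q in clusters[y]]
            avg = sum(dist[frozenset(pq)] for pq in pairs) / len(pairs)
            if best is None or avg < best[0]:
                best = (avg, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]  # merge the closest pair
        del clusters[y]                          # y > x, so index x is unaffected
    return {name: label
            for label, members in enumerate(clusters, start=1)
            for name in members}

# Made-up sample data: A and B have the same shape shifted in time, C and D are flat.
samples = {
    "A": [0, 1, 2, 3, 2, 1, 0],
    "B": [0, 0, 1, 2, 3, 2, 1],
    "C": [5, 5, 5, 5, 5, 5, 5],
    "D": [5, 5, 6, 5, 5, 5, 5],
}
rows = [(sid, t, v) for sid, values in samples.items() for t, v in enumerate(values)]
wide = long_to_wide(rows)
labels = average_linkage_clusters(wide, k=2)
# Back to long format: one cluster label per original row.
long_labels = [(sid, t, labels[sid]) for sid, t, _ in rows]
```

Stopping the merge loop at k clusters plays the role of pruning the tree, and the last line mirrors the conversion back to a long table.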

The screenshot is from Tableau 10, but don’t worry if you’re not on the Beta program. You can download the Tableau 8.1 version of the workbook from HERE. Enjoy!

Going 3D with Tableau

Even though they are not the type of visuals that first come to mind when you talk about business intelligence, if your background is in science or engineering like mine, you have probably made 3D charts at some point in your career. And by that I don’t mean 3D bar or pie charts. I mean scatter plots, contours, mesh surfaces… where the third dimension adds meaningful information.

If you wanted to do these types of charts in Tableau, how would you do them?

I thought it would be good to put together a few examples starting from the simplest and moving to more advanced visualizations. Let’s start with the scatter plot below.

Four clusters in a 3D scatter plot

Achieving this takes a background image (the cube image, to emphasize the feeling of depth) and 3 simple calculated fields that re-project the x, y, z coordinates to x_rotated and y_rotated fields. Wikipedia has a great summary if you’re curious where these formulas come from. You can use the z_rotated field to set the size of the marks to give a perspective effect, and also to set the sort order of the marks so you get the z-order right. You can find this chart as a Tableau Public visualization HERE if you’d like to take a closer look or use it as an example for your own 3D scatter plots.
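For readers who want to see the re-projection spelled out, here is a small Python sketch of the standard rotation-matrix formulas. It is not a copy of the calculated fields in the workbook: the function name, the angle parameters, the sample points, and the convention that marks are drawn in ascending z_rotated order are all assumptions made for illustration.

```python
import math

def rotate_point(x, y, z, angle_y_deg, angle_x_deg):
    """Rotate (x, y, z) about the y axis, then about the x axis.

    Returns (x_rotated, y_rotated, z_rotated). The first two are what you
    would plot; the third can drive mark size and sort order.
    """
    ay = math.radians(angle_y_deg)
    ax = math.radians(angle_x_deg)
    # Rotation about the vertical (y) axis.
    x1 = x * math.cos(ay) + z * math.sin(ay)
    z1 = -x * math.sin(ay) + z * math.cos(ay)
    # Rotation about the horizontal (x) axis.
    y_rotated = y * math.cos(ax) - z1 * math.sin(ax)
    z_rotated = y * math.sin(ax) + z1 * math.cos(ax)
    return x1, y_rotated, z_rotated

# Hypothetical points; sort by z_rotated to get a consistent drawing order.
points = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
rotated = [rotate_point(x, y, z, angle_y_deg=30, angle_x_deg=20) for x, y, z in points]
draw_order = sorted(rotated, key=lambda p: p[2])
```

Because a rotation preserves distances, each point stays the same distance from the origin; only its apparent position and depth change as the angles change.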

The example above shows rotating in 2 directions (the dashboard exposes one control, but you can download and open the sheet to see both options). But what if you wanted to be able to rotate in 3 different directions? I saved this for my next example, as an incremental change makes it easier to follow.

Caffeine Molecule (click to view Tableau Public viz)

The caffeine molecule example above uses a dual axis chart. One axis draws the atoms while the other draws the bonds between them. It still relies purely on the background image, mark size and z-order to achieve the 3D look, just like the previous example, but you will notice that this time “size” has a multiplier so that, in addition to the distance in 3D space, it also represents the relative sizes of different atoms. Another thing you’ll notice is that, now that you’re using a dual axis, z-order doesn’t always work well, since Tableau sorts within each axis independently. In this setup, atoms are always drawn above bonds, but each group is sorted within itself. You can find the Tableau Public viz HERE or by clicking on any of the screenshots.

You can extend this basic idea to many other types of 3D views. For example, you can build a contour plot like the one below and go from 2D contours on the left to a perspective view with 3 extremely simple calculated fields.

Contour Plot (click to view Tableau Public viz)

Or 3D meshes that let you play with different parameters and allow you to explore results of different equations.

3D Mesh in Tableau (click to view Tableau Public viz)

To look at waves and ripples…

3D Mesh in Tableau (click to view Tableau Public viz)

Or draw filled surfaces…

Filled surface

Or just do some viz art, like downloading radio telescope data to build the cover of Joy Division’s Unknown Pleasures album in 3D in Tableau.

Joy Division - Unknown Pleasures (click to view Tableau Public viz)

How about also drawing axes, grids and even labels, and having them rotate with the visualization, instead of using a static background image and only rotating the marks?

3D scatter plot with axis in Tableau (click to view Tableau Public viz)

Since you’re custom drawing these objects, you will need data points for axis extents, grid cell coordinates etc. I did this using Custom SQL. You can find the data used in all example workbooks HERE, as well as the SQL used. But the easiest way to understand how it works is to explore the visualization. Thanks to VizQL, you can simply drag the [Type] pill from the color shelf to the row shelf to see the visualization broken into its pieces. This will help you understand how lines (surfaces, grids etc.) are handled differently from points and how it is possible to turn the grid on and off using a parameter.

Breakdown of components of the visualization (click to view Tableau Public viz)

Besides Custom SQL, there is very little change to go from static background image to dynamic axis. You will see there is a minor change only in the x_rotated field’s definition.

No 3D visualization post would be complete without an example that involves 3D glasses:)

To be honest, this post didn’t originally have one, but as I was wrapping up the examples, I showed them to some co-workers. It turned out that two Tableau developers (Steven Case and Jeff Booth) had put together a stereoscopic sphere example. They thought it would be a nice addition to this post and were kind enough to share it with me. Time to put on some anaglyph glasses…

Anaglyph Stereoscopic 3D Sphere in Tableau (click to view in Tableau Public)

In this post I tried to share some examples of how to create various types of useful 3D charts in Tableau. I hope you find it useful.

Hidden gems in Tableau 9.2

Tableau 9.2 has just been released. With so many exciting features, such as Mapbox integration, the ability to move totals, new permission settings and a new iPhone app, it is easy to miss some small but welcome improvements.

Here are a few little features that I am sure some of you will appreciate.

Filtering on discrete aggregates : Throwing that ATTR(State) or a discrete MIN(Date) on the filter shelf is not a problem anymore.

Putting more in your LOD dimensionality : Before 9.2, you wouldn’t be able to use Sets, Combined Fields and Bins as dimensionality in your LOD expressions. Now you can write that {fixed [Profit (bin)] : AVG([Sales])} calculation you always wanted:)
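As a rough illustration of what a FIXED expression like the one above computes, here is a Python sketch: the aggregate is calculated once per value of the stated dimension (here the profit bin) and then attached to every row in that bin. The bin size of 100, the lower-bound bin labels, the field names and the sample rows are all assumptions for the example, and this says nothing about how Tableau actually executes the query.

```python
import math
from collections import defaultdict

def fixed_avg_sales_by_profit_bin(rows, bin_size=100):
    """Mimic {fixed [Profit (bin)] : AVG([Sales])} on a list of dict rows."""
    def profit_bin(row):
        # Assumed binning: label each bin by its lower bound.
        return math.floor(row["profit"] / bin_size) * bin_size

    totals = defaultdict(lambda: [0.0, 0])  # bin -> [sum of sales, row count]
    for row in rows:
        entry = totals[profit_bin(row)]
        entry[0] += row["sales"]
        entry[1] += 1

    # Attach the per-bin average to every row, whatever else is in the row.
    return [
        dict(row,
             profit_bin=profit_bin(row),
             lod_avg_sales=totals[profit_bin(row)][0] / totals[profit_bin(row)][1])
        for row in rows
    ]

# Made-up sample rows.
orders = [
    {"profit": 10, "sales": 100},
    {"profit": 50, "sales": 200},
    {"profit": 150, "sales": 300},
    {"profit": -20, "sales": 50},
]
result_rows = fixed_avg_sales_by_profit_bin(orders)
```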

As usual, we are looking forward to your feedback on the new release!

Quick tip : Creating your own dashed line styles in Tableau

You may be writing a paper that will be published in black and white, so color is not an option, or you may just like using different patterns. Whatever the reason, sometimes you want to draw your line charts with dashed lines. The most common methods people use to do this in Tableau are 1) using a dual axis chart and 2) using the Pages shelf. Both are very simple, but (2) returns much more pleasant results.

Last week, I was talking to a Tableau customer and what they asked for during the call was not achievable with either of these methods since they wanted to be able to use multiple dashed line styles in the same visualization and also display a legend for it. So I had to improvise:)

After the call I thought it would be good to put together an example and share it with everyone in case anybody else wants to do the same. You can get to the Tableau Public visualization by clicking HERE or the image below and download a copy of the workbook to take a closer look if you like. I also added the other two options I mentioned into the sample workbook.

Dashed lines in Tableau (click to see in Tableau Public)

So how does it work? The solution is simple but has to be applied with caution. The best way is to compare before and after results to make sure no sudden peaks or dips are eliminated by the dashed line effect. The trick is to insert NULLs in the right places to get the line effect you like. For example:

IF INDEX()%5 <> 0 THEN
[Your Field Here]
END

In the example I published, I used a densified axis to get a very smooth curve. This is not required, but with a smooth curve and a dense set of points, the result looks much better.

If you’re doing curve fitting using R integration, you can achieve something similar inside your R script using

for (i in 1:6) result[seq(i, length(result), 24)]<-NA;

where result is assumed to be the name of the vector you would be returning to Tableau. What this does is set 6 consecutive rows of your result to null in every 24 rows.
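If it helps to see exactly which points get blanked out, here is a small Python sketch of both patterns. The function names are made up for illustration; positions are counted from 1, as in both INDEX() and R.

```python
def dash_by_index(values, period=5):
    """Mirror IF INDEX()%5 <> 0: every period-th point becomes a gap."""
    return [v if i % period != 0 else None
            for i, v in enumerate(values, start=1)]

def dash_by_block(values, gap=6, period=24):
    """Mirror the R loop: the first `gap` points of every `period` become gaps."""
    return [None if (i - 1) % period < gap else v
            for i, v in enumerate(values, start=1)]

short_dashes = dash_by_index(list(range(1, 11)))
long_dashes = dash_by_block(list(range(48)))
```

In the first pattern each visible segment covers 4 points followed by a 1-point gap. In the second, each gap covers 6 points and each visible segment 18, so changing the two numbers gives you different dash styles.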

Tableau Customer Conference 2015

Every year I keep asking myself: could the conference get any better? And it does. This year, 10 thousand customers and 1 thousand Tableau employees got together for the data event of the year.

I flew to Las Vegas two days before the first day of the conference as I was part of the backstage crew and a backup speaker. After two days of rehearsing on site at the MGM Grand Arena where Manny Pacquiao and Floyd Mayweather “took stage” a few months earlier, we were ready for the big day.

Backstage the day before
The calm before the storm

Of course two days is nothing compared to the overall time spent preparing for the keynote. Even though each speaker spends roughly 10 minutes on stage, a speaker and his/her backup spend at least 100 hours looking for interesting datasets, writing the demo scripts and rehearsing them. But it is definitely worth it. A conference of this scale and “data rock stars” attending it deserve no less.

Keynote backstage

Overall, my favorite part of the keynote demos was the visual analysis section, which covered support for embedding visualizations in tooltips and connecting to spatial data files (shape files etc.), among many other smaller improvements. But the biggest feature in the keynote for me was data integration (cross-database joins and unions). Since my teams work on advanced analytics features such as clustering and multivariate outlier detection, I was very happy to see the positive reaction from the audience.

Once in a while my Photoshop skills come in handy. In addition to working on the features and co-authoring the demo, I also designed the Tableau 2075 conference announcement image that concluded the analytics section of the keynote.

Tableau Conference 2075, Kepler 452-b

One of the most exciting parts of the conference for me is the opportunity to talk to lots of Tableau users face-to-face. This year with Tableau Community Appreciation Party, it got even better. I got to meet with lots of fellow Tableau users I see and interact with on the forums and bloggers whose work I enjoy.

The analytics roundtable was very helpful for hearing detailed reactions to the keynote demos and what the needs were in other similar areas. Based on these discussions, I compiled a list of topics to cover here on the blog in the next few months and was convinced that I need to join this thing called Twitter.

Delivering my session “Understanding Level of Detail Expressions” (and hearing Joe Mako’s comments on the content afterwards) and being part of Matt Francis and Emily Kund’s podcast were other highlights of the event for me. And of course, meeting (and partying with) Tableau Zen Masters was the best way I could imagine to wrap up the event. I am already looking forward to next year’s conference!

What does Tableau 9.1 mean for your LOD expressions?

Tableau 9.1 is out. You may not see much mention of this in the list of improvements and features in 9.1, but you will notice that some of your LOD calculations run MUCH faster. As I mentioned in my introductory blog post on LOD expressions, Tableau 9.0 runs each LOD expression as a separate subquery. Hence, as you added more LOD expressions to your sheets, you probably noticed some performance degradation. In 9.1, Tableau collapses the subqueries for LOD expressions that operate at the same dimensionality and filter context. The result is up to a 24x performance improvement in our TPC-DS tests!

Chord diagrams and Radial trees in Tableau

Two awesome posts from Noah Salvaterra and Chris DeMartini.

First from Noah, showing how to build a chord diagram, a.k.a. radial network diagram, in Tableau.

http://datablick.com/2015/08/27/diy-chord-diagrams-in-tableau-by-noah-salvaterra/

Chord Diagram in Tableau

This has been on my TODO list for a while along with dynamic arc diagrams.

And radial trees by Chris DeMartini http://datablick.com/2015/10/12/radial-trees-in-tableau-by-chris-demartini/

Radial Tree in Tableau

Lots of great network visualizations on Datablick lately including jump plots, trees, hive plots and Biofabric from Noah and Chris.