Visualization

# Dynamic Network Graph Layouts in Tableau using R

While history of graph theory goes back to Leonhard Euler’s paper on “Seven Bridges of Königsberg” published in 1736, until the rise social networks, analyzing graphs and use of graphs in visualization has been a niche area mostly used in academic circles.

Well..Now that you have access to all this data, how do you visualize it in Tableau?

In Tableau it is relatively easy to generate a network graph using dual axis charts with line and circle/shape mark types once you know the layout. But when you have a graph probably two important questions are :

• How to do the layout
• How to calculate relevant metrics from the graph (e.g. betweenness, closeness centrality, authority scores…)

Especially if your graph is changing over time which is very likely, you would like to do these on the fly. Using the R integration feature, this is a fairly easy task. Below is an animation showing the layout changes as new vertices get added to the graph done using the workbook you can find in this blog post.

In this example we will be using igraph package which offers a variety of layout options to choose from as well as a handful of functions to calculate some interesting metrics.

To use the mark type Line in Tableau, you need to provide a list of points that defines a path. In this case since we are connecting vertices using straight lines, we only need two points (from/to) shown with order 1 and 2. Since our data is in the structure User : Person A, Retweeted by : Person B, the names will alternate e.g. first point of the line is associated with Person A while the other end Person B. You can see this structure being built in Custom SQL.

Once the structure is in place the calculated field Graph Nodes contains the R code to retrieve X – Y coordinates to display in the chart as well as the metric shown in the tooltip (in this case I used betweenness centrality). Everything is retrieved in a single call from R then decomposed into three parts locally.

However R is expecting the data in a different format than it is in Tableau. Tableau has 2 rows for each edge (from/to). R library on the other hand is expecting the edge list as two columns so we filter out the extra rows as we’re passing them to R first in the section of the script shown below.

mydf <-mydf[(mydf\$Order=='1')

Also from R we are getting a list of X – Y coordinates for each vertex (which is the union of the two vectors we are passing to R) while each vertex is linked to multiple other vertices so appear as many more rows in Tableau. So first we merge the coordinates with the unique list of vertex names :

cbind(coords, data.frame(users=V(mygraph)\$name));

and then at the end do a join using these names in R to replicate the rows so Tableau gets back two rows to define each edge. The join happens in the following part of the script

c<-join(allusers, c, by = 'users');

I tied the layout algorithm choice to a Tableau parameter to make it easier to try different options. Even though the R script passes weights (number of times Tweeted), not all layout algorithms take this into account. If you want force directed layout and see how these weights/forces affect the results, Fruchterman-Reingold would be the option to use.

Standard

## 37 thoughts on “Dynamic Network Graph Layouts in Tableau using R”

1. Its Possible! Fantastic!
will try out the dynamic layout graphs with Tableau, thanks!

2. Aman says:

Looks like the Tableau workbook has been removed. Can you please post it again?

• I just tried it and verified that the file is there but Skydrive service seems to be unreliable recently. Page sometimes opens, sometimes doesn’t.

3. Incredibly helpful. I have been embedding D3 and Sigma.js based graph viz in my dashboards via invocation of a web service. This is a great alternative.

4. Hi Bora,

In your R script that does the work for the calculated field **GraphNodes**, could you please help me to understand the reason why you are joining the data frame **c** back to a vector named **allusers**, which is itself populated from the field called **AlternatingName** in your data source ?

I’m trying to follow along, but it’s not clear what this section of your script is doing, or what the purpose of this **AlternatingName** field really is.

Thanks!
Keith

• Hi Keith,
Tableau is drawing the graph as individual lines each consisting of two points. So a given vertex will appear as many times as there are edges connecting to it. What R returns is coordinates for each vertex. But for calculation to work, results from R calculation has to match rows in the original table/data source in Tableau. So these coordinates will need to be replicated before being pulled into Tableau. Joining with allusers achieves this replication. I hope this helps.

~ Bora

• Hi Bora, thanks.

I’ll have to study it a bit more deeply to understand. So far I’ve been concentrating on the R piece & havent yet looked at how the tableau Vis is built.

I do understand ‘vector in / vector out’. So, probably once I grok how the Vis was built then it will be clear. One piece that would help: you mentioned we could see how the underlying data structure was built in the Custom SQL. But since the workbook connects directly to an extract, the original data & Custom SQL are hidden, actually.

If you still have them, it would be nice to see the Custom SQL that was used to produce the extract.

I have some nice things I want to do w/ Tableau & the Twitter API & directed graph visualizations. But I’ll have to get past this part first =)

Thanks!

• Then this might help

SELECT [Sheet1\$].[NumTimes] AS [NumTimes], [Sheet1\$].[RetweetedBy] AS [RetweetedBy], [Sheet2\$].[Followers] AS [Followers], [Sheet2\$].[User] AS [User], 1 as [PathOrder],[Sheet2\$].[User] AS AlternatingName FROM [Sheet1\$] RIGHT JOIN [Sheet2\$] ON [Sheet1\$].[User] = [Sheet2\$].[User] UNION ALL (SELECT [Sheet1\$].[NumTimes] AS [NumTimes], [Sheet1\$].[RetweetedBy] AS [RetweetedBy], [Sheet2\$].[Followers] AS [Followers], [Sheet1\$].[User] AS [User], 2 as [PathOrder], [Sheet1\$].[RetweetedBy] AS AlternatingName FROM [Sheet1\$] INNER JOIN [Sheet2\$] ON [Sheet1\$].[RetweetedBy] = [Sheet2\$].[User])

You should be able to get to this by right clicking on the data source and then properties. In the table you’ll see a row that contains the query.

5. Hey! What do you know, you could get the custom SQL of an extract from the data source properties. I had never seen that before.

Thanks! I’ll be tinkering w/ this & hopefully get my own directed twitter graphs working, as well.

6. Raj says:

Hi Friends it’s great to know that we can work with network graphs in tableau using R. But when I tried to create a network with the same example I am getting the error
Error in .Call(“R_igraph_layout_lgl”, graph, as.double(params\$maxiter), :
At type_indexededgelist.c:750 : cannot get neighbors, Invalid vertex id

I installed R on my machine and still getting the same mesg.
Kindly let me know the solution if any.

• Is suspect this might be related to lack of connectivity between some vertices. Is everything connected or do you have a lot of orphan vertices?

• Britany says:

Hello,
I am also receiving this error. I do believe I have a lack of connectivity between some vertices but I would still like to be able to plot and view this. Do you know of a workaround?

Demo has been very insightful!

7. Alex says:

Hi,

Do you have the tweets.xlsx file sample?

Thanks

8. Alex says:

Hi,

Is it possible for R to consider weighted edge and display the thickness of the edge in the network graph in Tableau?

Thanks

• Alex says:

Hi,

Thanks for the raw data file. I am exploring whether Tableau is able to display the thickness of the edge between 2 nodes if the data contain a weighted value. Is it possible to seek your advise/suggestions on this?

Thanks

• Sure. Just drag the field that contains the weight value onto the size shelf for the line mark. It will set the line thickness.

9. claudia says:

Hi, Thanks a lot for the detailed walkthrough. I managed to replicated it with my data, but I’m afraid it wont be suitable for my larger data files. I have an edge list file for around 10K nodes, with one or more relationships, around 30K records (only direct links). I would really love to be able to produce the geodesic coordinates in R and use Tableau filtering (workbook will be uploaded to server to share), with decent load speeds.
I’ll appreciate any tips.

• How often does the data change? Can you write the results from R into a CSV and just open that in Tableau instead. It would be faster than recomputing the layout on-the-fly.

10. When I try and recreate this, using the exact same data and calculated fields, I get an error when adding the X and Ys to the shelf:

“Error occurred when communicating with the Rservice. Error in `[.data.frame`(y, by) : undefined columns selected”

Any idea why this would be? Even if I remove them from your working workbook and just replace them, it gives that error.

Your workbook works fine when opened as-is, but can’t be recreated.

• Most likely issue is the addressing/partitioning settings. When you remove and re-add them, it reverts back to defaults. I would right click, edit table calculation and look at the Advanced settings and make sure in your recreation settings are the same.

• Jason says:

Hi Bora,

Thanks so much for this helpful workbook and guide! I just happened to have some results in iGraph that I wanted to get into Tableau.

Like Jacob, I have the same error whenever I try to alter the workbook. It seems to go beyond the issue of addressing/partitioning, as I can’t even drag any of the GraphNodes calculated fields onto Tooltip or Filters. I’m using Tableau 9.0- could that be the reason why?

Thanks,
Jason

• I faced the same problem when I tried to recreate the graph. So I decided to make a duplicate sheet and modify it and see the difference with a new one. Finally I solved the problem by making the following actions on the new sheet:
1. Click on “Analysis” menu and uncheck “Aggregate Measures”.
2. After adding all the axes and adding “User” and “RetweetedBy” to details, right click on each axis (including X axis) and click “Edit Table Calculation” then Compute using “Specific Dimensions” and check both “User” and “RetweetedBy” (ignore any error that may appear until last edit).
3. If you add the “PathOrder” to the path, it may show an error. To fix this, right click on the added “PathOrder” path and choose “Attribute”.

11. SANDEEP says:

Hello experts,
I am a newbie to the art of Data mining.
Great article- found this while trying to extract Coordinates and its corresponding Vertices out to SQL Server.

So I tried the “cbind(coords, data.frame(users=V(mygraph)\$name));” query and I was able to get the Users and coordinates.

On the next query “c<-join(allusers, c, by = 'users');", am getting an error couldn't find function JOIN. What package do I need to install to get JOIN function?

Any help would be greatly appreciated!

• plyr is the R package you’re looking for.

12. Thank you so much for the step-by-step instruction. It does run faster if we created the edge list outside Tableau and import the csv file to Tableau. I added two lines to the GraphNodes calculation that aggregate weight and remove duplicates and loop interactions. mydf <- aggregate(mydf[,3],mydf[,-3],sum); mygraph <- simplify (mygraph, remove.multiple=F, remove.loops=T);

13. Brian says:

Error in layout_with_lgl(structure(list(31, TRUE, c(0, 1, 2, 3, 4, 5, : unused argument (weights = c(3, 2, 6, 8, 10, 10, 4, 18, 5, 9, 5, 14, 10, 4, 4, 8, 3, 2, 5, 8, 4, 8, 2, 3, 8, 5, 3, 7, 5, 10, 4, 5, 10, 4))

Any ideas for how to troubleshoot? My RServe connection is working, I installed igraph in R, and loaded the package. To be honest, I’m pretty rusty with R.

Thanks!

• Do you still get this message if you use it with a different layout algorithm option e.g. Fruchterman-Reingold? From the message it sounds like there are changes to igraph so it is complaining about some data being passed that is not used by the method.

• Pranshu says:

I am getting the same error. Did we get the solution for this ?

• Jay says:

Bora, you were right. The workbook functions with a different layout algorithm option in this case.

14. Borja Leiva says:

HI Bora, great article!!! Thanks. I downloaded the workbook to practice but when I open it I get this message from Tableau “No external service configured. Specify a server name and try again” What does it mean?

• Hi Borja,
You need to go to Help > Settings and Performance > Manage External services and enter the host name and port number for a machine that is running Rserve.

15. Hi Bora. Great article. I was extending your work on a work I have. I have the First Category, Second Category, Relationship between them, the users and the Path Order. I replicated the GraphNodes calculation you have but I am getting all NA~NA~NA on my side. What am I doing wron?