Visualization

What’s new in Tableau 9.0? Part 2 – Level of Detail Expressions

Level of detail expressions extend Tableau’s calculation language by introducing the capability to define at what level aggregations should happen. It is a small syntax change but implications are HUGE!

I will focus on the technical details in this post with a few examples to clarify the concepts. Stay tuned for a series of posts on Tableau website from our Product Consultants which will cover a range of business questions you can answer using level of detail expressions with lots of example workbooks, perfect for some inspiration.

What is level of detail anyways?

In Tableau, the results for a calculation such as SUM([Sales]) depend on the context. The context is defined by the filters and level of detail. Dimensions on rows, columns, color, size, label, detail or path shelves define the level of detail for the sheet.

For example if [State] dimension is on Row shelf, SUM([Sales]) will give sum of all transactions for each [State]. If [Product Type] is also on one of the shelves I mentioned above then SUM([Sales]) will give sum of all transactions within each [State] for each [Product Type]. The more dimensions in your sheet or the more unique members each dimension contains, the more granular your results will be. Since each result as drawn as a mark in your visualization, the finer the level of detail for your sheet, the more marks you will have.

Below is an example using the Coffee Chain dataset. The image on the left has Product Type as the only dimension in the sheet while the one on the right also has Product, giving it finer granularity hence more marks.

Product type vs Product, coarser and finer level of detail respectively

Filters on the other hand change the context by reducing the number of data points used in the calculation e.g. remove a state by name or where profit is below zero or that are not in TOP 5 etc. Using table calculations as filters is an exception to this however since they just hide marks without changing the data over which the calculations operate.

Level of Detail Expressions

Level of detail expressions allow you to specify the context for your calculations. For example the following expression

{fixed [State] : SUM([Sales])}

will always calculate sum of sales per state regardless of sheet’s level of detail. It also ignores the filters in effect except context filters, data source and extract filters.

Tableau has had several mechanisms to calculate at a coarser level of detail than the sheet or perform multiple levels of aggregation for many releases. Reference lines, totals and table calculations have been commonly used methods for these tasks which worked very well in some cases but not others. What level of detail expressions bring to the table is the ability to use the results mixed with row level values, push the mixed and multi-level aggregate calculations to the database (as opposed to local calculations with other methods) and use results as dimensions, for table calculation addressing/partitioning, in actions or bin them.

Level of detail expressions have the following structure

{keyword [dimension1],[dimension2]… : Aggregate expression}

The portion highlighted in blue is the dimensionality expression which consists of 0 or more dimensions. If there is more than one dimension, they must be separated with commas. Aggregate expression does not necessarily have to be a simple aggregation e.g. SUM([Sales]) you can insert other calculations here, even wrap an IF block inside a level of detail expression.

There are 3 different keywords

Fixed : Limits the level of detail of the aggregate expression to only the dimensions listed in the dimensionality expression. I will cover the interaction with filters towards the end of this post.
Include : Defines the level of detail for the aggregate expression by combining the dimensions listed in the dimensionality expression with dimensions in the sheet. It is great when you want to calculate at a fine level of detail in the database then re-aggregate and show at a coarser level of detail in your visualization and/or when you want your calculation’s level of detail change when you drill down or add more dimensions to your Sheet.
Exclude : Defines the level of detail for the aggregate expression by removing the dimensions listed in the dimensionality expression from dimensions in the sheet. It is great for ‘percent of total’ or ‘difference from overall average’ scenarios. Can be considered as the written/textual representation of features like Totals and Reference Lines.

Specifying no dimensions using fixed keyword e.g. {fixed : MIN([Order Date])} gives a single result for the entire table. Shorthand for {fixed : MIN([Order Date])} is {MIN([Order Date])}. This gives you the date your store made its first sale, when you earned that dollar bill you got framed 🙂

Since level of detail of the sheet determines the number of marks drawn in the visualization, when your calculation has a different level of detail, something needs to be done to reconcile the difference.

Here is what Tableau will do automatically for you.

If your calculation is of coarser level of detail, Tableau will replicate the results as needed  such that there is a value for each mark. In the example below

{fixed [Segment] : SUM([Sales])} is getting replicated for each Category, Segment.

Replication behavior when calculation has coarser level of detail than the sheet

This allows you to the following to calculate the difference between a Transaction Amount and the AVG Transaction Amount by that customer.

[Transaction Amount] – {fixed [Customer ID] : AVG([Transaction Amount])}

Note that this calculation mixes an aggregate with a row level value but doesn’t give you the error you’re used to seeing when you have aggregates and non-aggregates in the same calculation. Yet another benefit of using level of detail expressions!

If your calculation has finer level of detail, Tableau will aggregate the results as needed  such that there is only one value for each mark. In the example below

{fixed [Segment], [Category] : SUM([Sales])} is getting AVG’ed up to one value per segment (assume that the pill in the sheet has AVG as aggregation).

Aggregation behavior when calculation has finer level of detail than the sheet

With include keyword calculation will always either have same or finer level of detail than the sheet which requires aggregation to match sheet’s level of detail. For fixed it can be coarser or finer depending on the calculation and dimensions in the sheet. Since the need to aggregate or not depends on what dimensions are in sheet, it can change anytime. Having an aggregation defined is safe since if and when aggregation is needed, you already told Tableau how to do it. Because of all this, calculations with custom level of detail are always wrapped in an aggregate when in the sheet unless they’re used as dimensions. If no aggregation is happening (e.g. calculation has the same level of detail as the sheet or is coarser hence getting replicated instead) then changing the aggregation will not make any difference in the results. You will notice that when you drag a pill that is an exclude calculation, Tableau will default to ATTR (as opposed to SUM or AVG as you’re used to see with measures) to hint that what’s happening isn’t aggregation.

How does it work?

Level of detail expressions rely on database queries to calculate the results. If you look at your Tableau logs, you will notice them as sub-queries with inner joins and sometimes with cross joins. If your database doesn’t have a CROSS JOIN operator, Tableau will join without an ON/WHERE clause or use one that always returns true to get the same effect as long as your database supports it.

Cross-joins happen when there is no common dimension to join on. For example {MIN([Order Date])} is a single value repeated for every row hence there is no need for a join key.  It is a cross-join between the main table (that defines the sheet) and a one-row table (resulting from the calculation).

Of course this is not the only case. For example you can have a calculation such as {exclude [State]  : AVG([Sales])} inside a Sheet where [State] is the only dimension. This would also result in “blank” dimensionality again hence a cross-join query will be generated.

Level of Detail Expressions and Filters

There are several different kinds of filters in Tableau and they get executed in the following order from top to bottom.

Filters in Tableau

I colored Extract Filters and Table Calculation filters differently since former is only relevant if you’re creating a Tableau Extract from a data source and latter applies after the calculations are executed hence hides the marks without filtering out the underlying data used in calculations.

If you’re familiar with SQL, you can think of measure filters as HAVING clause in a query and dimension filters as the WHERE clause.

Fixed calculations happen before dimension filters, as a result unless you promote the pills in your filter shelf to “Context Filters” they will be ignored. There are many benefits to this. Imagine writing the following calculation

SUM([Sales]) / ATTR({SUM([[Sales])})

and having the dimension [State] in your Sheet. With this setup, this calculation will give you ratio of a State’s sales to total Sales. You can add a filter to your Sheet to hide some of the States but filter will only affect the numerator. Since denominator is fixed, it will it will still give the sum for all States. When you also consider taking advantage of this with filter actions in dashboards, you can imagine how powerful it can be.

Include and Exclude calculations happen after Dimension filters like any other (except fixed) measure in Tableau. So if you want filters to apply to your level of detail expression but don’t want to use Context Filters, you can rewrite your calculation using Exclude or Include keywords.

Level of Detail Expressions and Totals

There were several questions about this on Tableau forums and Beta e-mail alias so I decided to add a section to the post to cover this topic as well.

First, let’s talk about how Tableau computes Totals. In Tableau there are two kinds of totals. Totals (aka single-pass totals) and Visual Totals (aka two-pass totals). Former applies the current aggregation at a coarser level of detail than the Sheet and is the default setting when you turn on Totals. This option appears in the Total Using menu as “Automatic”.

Below is an example of single-pass Totals where you can see that subtotals in the first image on the left match AVG(Sales) per Category  (upper right), while Grand Total matches the AVG(Sales) for the entire table (in the lower right) i.e. when no dimensions are in the Sheet.

 Totals (aka single-pass totals)

If you wanted the Total rows to show the AVG of the rows above them e.g. the average of the values displayed for Bookcases, Chairs, Furnishings and Tables then you would need to use Visual (aka two-pass) totals. You can switch to visual totals by changing “Total Using” setting from “Automatic” to the desired aggregation to be used in second pass.

Note that values for single pass and two pass will be the same when what you’re doing is sum (“Total Using” setting) of sum (pill’s aggregation), max of max or min of min.

In the example below, Total Using is set to AVG. So subtotals show the AVG of SUM of sales per Sub-category within each Category. Green box shows the result while green line indicates the values that were averaged to get that result. Similarly the Grand Total line at the bottom (blue box) shows the AVG of all the rows marked with blue line. I renamed the total cells in this case so they are more descriptive of what’s happening.

Visual (aka two-pass) totals

Now that we clarified the basics, let’s move onto the main question of how totals and level of detail expressions play together.

Even though a single pass grand total applies the aggregation at table granularity i.e. as if no dimensions were in the table this does not affect what happens inside the curly braces.

For example if your visualization contains the dimension [Product Category] and your calculation is
AVG({include  [Customer Name]:SUM([Sales])})

LOD expression will still calculate at [Customer Name] , [Product Category] level for the grand total. Total row calculation only changes how the outer aggregate behaves (AVG in this case) so the results of the LOD calculation would get averaged up to [Product Category] for each mark in the Sheet but to table level (single value) for the grand total.

Single-pass Totals are not affected by the replication that is triggered by LOD expressions coarser than the Sheet. However please keep in mind that Visual (aka two-pass) totals do.

Understanding Nesting

You can nest several layers of level of detail calculations to answer even more elaborate business questions.  One thing to keep in mind though is the fact that context for a calculation is defined by its parent(s) as opposed to the sheet. For example :

{fixed [State] : AVG({include [Customer] : SUM([Sales])})}

it will have the same effect as writing

{fixed  [State] : AVG({fixed [State], [Customer] : SUM([Sales])})}

since nested calculation inherits the dimensionality from the outer calculation in the first case resulting in [State] + [Customer] = [State], [Customer]. Also since outer calculation is fixed, nested calculation will not be affected by the filters in the sheet.

And that’s not just it. Aggregation/replication behavior I explained earlier in this post also applies to nested calculations. Let’s walk through an example…

Assume you are trying to calculate the average customer spending in each State but you realize that customers travelling and spending small amounts in multiple states are skewing the results and instead of filtering those out, you decide to calculate the total spending for each Customer and use that value towards the State average. You could do this by writing the following calculation

{fixed [State], [Customer] : AVG({exclude [State] : SUM([Spending])})}

And set the pill’s aggregation to AVG when you drag it into your sheet.

In this example, nested calculation will give the SUM of [Sales] per [Customer]. However the outer aggregation has the dimensionality of State, Customer which is finer. This will result in replication of total sales per Customer for each State in which customer ever made a purchase.

Anatomy of a nested level of detail calculation

In this case since what happens between the two levels of calculations is replication, what aggregation you use here (AVG({exclude [State] : ….) doesn’t matter. The final step that averages customer totals to State level happens through the pill’s aggregation setting and the fact that State is the only dimension in your Sheet.

While this is a good example for explaining how nesting works there is even a simpler way of writing this calculation :

{fixed [Customer] : SUM([Spending])}

You just need to drag it into your sheet with the dimension State and set the pill’s aggregation to AVG.

This is because when Tableau generates this INNER JOIN, it does so between the dimensionality of the level of detail (LOD) expression and dimensionality that is the union of  the sheet and the LOD expression’s dimensionalities. In this case, the union of dimensionalities is Customer (from the calculation) + State (from the Sheet) which will result in the same replication behavior as in the nested case.

Working with disjoint LODs

When you think of all that’s happening to calculate the result, syntax to do it is amazingly simple.

Limitations

  • Calculations that involve level of detail expressions won’t be materialized in Tableau Extracts in  9.0.
  • Sets, combined-fields (use dimensions that make up the combined field individually by separating them with commas instead), parameters, table calculations cannot be used in the dimensionality expression in 9.0
  • Only field names can be used in dimensionality expression (e.g. an expression like MAX([Order Date]) or YEAR([Order Date]) or an IF statement cannot be used in the dimensionality expression). Note that you can use a calculated field that has the formula YEAR([Order Date]) and use that instead. If you use {exclude [Order Date] : SUM([Sales])} and have [Order Date] in your sheet, Tableau will recognize the date hierarchy and automatically exclude the dateparts from calculation’s level of detail as you drill down.
  • Table calculations and ATTR are not allowed in aggregate expression e.g. {fixed [County] : ATTR([State])} or {fixed [County] : INDEX()} are not supported
  • Dimensionality from one source with measure from another blended source can not be mixed in a level of detail calculation
  • All dimensions in the dimensionality expression should come from the same data source
  • Level of detail expressions that use exclude or include keywords can not be used as dimensions. They also can not be binned. This is because they have relative dimensionality. While this may appear like an arbitrary limitation, it makes them much more understandable. Consider the case where there are two include calculations in the Sheet that are used as dimensions. Calculation1 is {include x : min(i)}, Calculation2 is {include y : min(k)}. If there are no other dimensions in the Sheet, Calculation1 will behave like {fixed x, Calculation2 : min(i)} and Calculation2 will behave like {fixed y, Calculation1 : min(k)}. Circular reference! This is a solvable problem but at the expense of understandability. We chose ease of use and simplicity instead.
  • Level of detail calculations from secondary sources can not be used via data blending 2 in 9.0
  • In 9.0 results from Level of detail calculations are cached as a whole (not broken down per sub-query) so generating same sub-queries in different sheets/calculations won’t result in cache hits.

Data source support

In Tableau 9.0, level of detail expressions are not supported for the following data sources: Cubes, Big Query, DataStax, Informatica Data Services, Microsoft Jet, Splunk, Actian Vectorwise.

Following data sources has higher version requirements for level of detail calculations than minimum supported versions per Tableau technical specifications here :

Teradata 12+, Netezza 7+, HIVE 0.13+, Impala 1.2.2+

Rest of the data sources support level of detail calculations with the same minimum version requirements as listed in technical specs. The list also includes the new connections not yet listed in technical spec such as Apache Spark, IBM BigInsights and SPSS, R, SAS data files which adds up to a total of 36 connection types! Note that the new Excel and Text connectors are not Jet based so they have full level of detail calculation support as well.

Generic ODBC has limited support as cross-join syntax varies from database to database. What Tableau issues through generic ODBC may not be what’s supported by the database or database may not support cross-joins at all. So some scenarios may not work. It is advised that you use dedicated connection types where available for the best experience.

If you’re connecting through Tableau Data Server, it also has to be Tableau 9.0 or newer.

Performance
Level of detail expressions rely on database queries to calculate the results. Performance heavily depends on your database and size of the data, whether necessary indexes are in the database to be able to perform joins rapidly, complexity and level of nesting in your calculation. If your dataset is large and database is fast, level of detail expressions can give you vast performance improvements over Table Calculations since you can push finer detail calculations directly into the database instead of moving a large dataset over the wire into Tableau on your desktop.

You can make your calculations run more efficiently on the database, by avoiding unnecessary cross-joins. Also if you have large tables, you can get performance gains by using MIN instead of ATTR when you know replication will happen. For ATTR, Tableau queries for both MIN and MAX to see if it needs to show * or not. If you know your calculation will replicate, you already know MIN will be same as MAX. This way you can get the same results and shave-off a sub-query.

Other Notes

When using level of detail calculations as dimensions, be careful if your aggregation is returning floats since floating point arithmetic does not give results that are reliable for equality comparisons (JOINs check for equality) and this may result in unexpected results.

Conclusion

Congratulations! You made it to the end of the post!  I agree it was a long post but when you think about it, I think it is pretty cool that it was possible to fit most technical details in a single blog post. This would probably be a 3 volume book that comes in hardcover in the old world of business intelligence and calculations would only work on one platform. Being able to do this with 30+ different databases with such little visible change in the calculation language is very exciting.

Do you know what’s more impressive and excites me even more? All the cool things you will do with it… All the insights you will discover… Enjoy!

Standard

41 thoughts on “What’s new in Tableau 9.0? Part 2 – Level of Detail Expressions

  1. I posted some updates to cover the changes we made in Beta5 and Beta6 so content is up-to-date as well as adding a section to address some questions that came up during the Beta program.

  2. Thank you for this great post! I have been struggling segmenting customers based on the amount of their monthly invoice and counting how many invoices were generated greater than, let’s say, 10000 per customer. Thanks to your post I got a sudden inspiration to create following field :
    { FIXED [Date], [Customer] : SUM(IF ([Subtotal]) >=10000 THEN 1 ELSE 0 END)} with [Customer] on my Sheet. I have read many blog posts about LoD/Tableau 9, but I did not really get a grasp to it – until now! Including nested statements really helped me a great deal! If not I was going to back and forth with excel which would not be as autonomous and ideal. You saved my day; thanks again!

  3. Tim Benschop says:

    Great post, thanks Bora. Since you are also quite into the R and Tableau connection, I have a question which is related to Tableau V9 LOD combined with R. These calculated fields seem to be automatically table calculations, also in Tableau 9. I would expect it should be possible to get rid of the table calculations and use the LOD to create the view. Something like this:
    { FIXED [observations]:
    SCRIPT_REAL(”
    c(1:5)

    ,sum([Income]))
    }
    This is just an example where I want to send a vector with length 5 to Tableau. Normally I would use the table calculation compute using [observations] (which has length 5). But I would guess this should be possible without the table calculation?

    I get the error message: “Level of detail expressions cannot contain table calculations or the ATTR function.

    Thanks.
    Tim

    • This is expected. LOD calculations are executed in the data source as subqueries and bound by the capabilities of database e.g. can do SUM, AVG, MAX etc. but can’t execute arbitrary R code.

  4. eybear says:

    Can you go into technical detail about the data source limitations with LOD expressions? Is it really just the cross-join functionality or something else too?

    • It is not just cross-joins. For some databases queries were too complex for the database optimizer to return meaningful results so sometimes they would fail and sometimes results would change between each run. For some databases it was because they don’t support certain types of simple queries e.g. Tableau may do the following ((Starbucks.Market=t0.Market) OR ((Starbucks.Market IS NULL) AND (t0.Market IS NULL))) which some databases will fail to execute. We just didn’t want to put our customers in a position where things sometimes work, sometimes don’t and the results sometimes can be wrong and they need to read a long troubleshooting article to figure those out or not even realize the results were wrong because they didn’t come across that article.

      • eybear says:

        Thanks for the reply Bora!

        I assume the Actian Vectorwise falls into one of the failure modes as well?

      • That is correct. Sometimes LOD queries get very complex. In our tests Vectorwise’s optimizer behaved inconsistently. Sometimes the same query would return different results. So we decided to turn it off for Vectorwise for the time being.

  5. Matias Casassus says:

    Before starting with Tableau, I was heavily into MDX, so I really missed that kind of control over aggregations (leaves, levels…) on non-cube data sources.
    But here it is!
    Thank you for the good news and thorough explanation

  6. Amit says:

    Hi Bora,

    Just one quick question.. am trying to use FIXED calculation on Tableau extract and even though the extract is small but the performance has really gone down… Anyway of improving performance with detail calculations….

  7. Pingback: Tinkering With LOD Expressions & Percent of Total « Tableau Tinkering

  8. Pingback: Creating the Top 5 with a %Total in a Specific Month within Tableau | James Dunkerley's Blog

  9. Nick Rao says:

    Solved my issue within 10 minutes. And last week I spent 5+ hours on this. Truly awesome function! Thumb up to Bora and Tableau 9.

  10. Pingback: An Example of #Tableau LOD Calcs, Filters and Order of Operations | 3danim8's Blog

  11. Pingback: Oh, LOD, why?! Reasons to Channel the Force | Tableau Fit

  12. juanhdv says:

    I have seen some examples in tableau’s forum specifying no dimensions with include or exclude keywords, which is the meaning of such specification?

    • {MIN(Sales)} will give you the minimum sales for the entire table. Like dragging min(sales) pill into a view that contains no dimensions. {MIN(Sales)} is also the shorthand for {fixed : MIN(Sales)}. {include : MIN(Sales)} or {exclude : MIN(Sales)} will give you the same value and they both compute at the level of detail of the sheet (they add nothing and they remove nothing) so if you have {include : sum(sales)} and State as a dimension it will give you the same value as sum(sales) in that viz which is sum(sales) per State.

  13. Hi Bora, What a wonderful article. Of all the Tableau bloggers I have read, you have a gift for explaining things in a way without using too much Tableau or statistical jargon. The article is very easy to understand. I look forward to reading more article from you!

    I did not understand the part about circular joins between two INCLUDE LODs if they are allowed to be dimensions – is there a typo there?

      • Thanks for the response Bora. I am quoting you here

        “Consider the case where there are two include calculations in the Sheet that are used as dimensions. Calculation1 is {include x : min(i)}, Calculation2 is {include y : min(k)}. If there are no other dimensions in the Sheet, Calculation1 will behave like {fixed x, Calculation2 : min(i)} and Calculation2 will behave like {fixed y, Calculation1 : min(k)}. Circular reference!”

        Why would calculation1 behave like {fixed x, Calculation 2: min(i)}? Is calculation referring to y? As per your text, I do not see any link between the 2 calculations and was wondering how circular reference is possible

  14. Hi Bora,

    It would be great if you can explain this statement with some examples in one of your future blog posts

    Include is helpful “when you want your calculation’s level of detail change when you drill down or add more dimensions to your Sheet.”

  15. Lana Dang says:

    HI Bora, thanks for the excellent post. I can see here that LOD functions allow for mixing non-aggregate and aggregate in one calculated field. It works well if the 2 fields are from the same underlying dataset.
    Is there a way to use LOD functions to allow for mixing non-aggregate and aggregate fields, whereby the non-aggregate field is from a different dataset from that of the aggregate field? (the two dataset are blended using a common field). I look forward to your help on this. Thank you.

    • Blended fields (from secondary source) always come as aggregates. When I say aggregate and disaggregate what I mean is that an LOD expressions aggregate the data but it is treated as a disaggregate in a calculation. E.g. {fixed state:SUM(Sales)} is aggregating by summing per state but it is replicated at row level hence can function as a disaggregate. Blended field comes in as a true aggregates e.g. SUM(Sales) as opposed to as LOD expressions so they can only be mixed with an aggregate. The solution to your problem would be using cross-database joins (new in Tableau 10) which allow row level joins.

  16. Manu says:

    What if i use tableau extract and i want a field that has coarser level aggregation than the sheet(Tableau version 9.0)?

  17. One of the best blogs on LOD so far.
    I want to know if there is a way to use LOD to do MoM comparison to show on a dashboard, i just want to show %change from previous month and use it as a shape or arrow to indicate performance. Any help or link would be much appreciated.

  18. Roya says:

    Hi Bora, Great explanation!
    However I still have problem mixing aggregate and non-aggregate with LOD expression. I have an aggregate formula where I need to use Running_Sum to calculate how a value is increasing or decreasing row by row. On the other hand, I have to apply a set of filters to my dataset bu using LOD and it’s not able to combine these two together. I really appreciate if you could help with this problem.
    I can also include my tableau file if needed.
    Thanks

    • Roya says:

      I just saw this : “Table calculations and ATTR are not allowed in aggregate expression”. I believe using running_sum as table calculation is not allowed in the expression. is there any way around you suggest?

      • This is a capability we’d like to add to LOD expressions in the future however currently you can’t do calculations like running sum this way.

        If you don’t mind writing custom SQL you can use SQL Analytic functions to do this if your database supports it.

      • Roya says:

        Thanks a lot Bora for your comment. Actually I’m not familiar with SQL, and I really stuck with my file. Is it possible to share an example of a solution for this type of problem using SQL?
        My problem is that I have a database with a few categories, and I need to use table calculations for each category (running_sum in my case), and finally combine all categories together in a calculation field using case statement. The problem is that I cannot combine them in case statement since it cannot mix aggregated and non-aggregated data, and I cannot use LOD expression with table calculation!

  19. Helen says:

    Thank you Bora, this is certainly the best LOD post I have found. I come from T-SQL/MDX of SQL database, I have been struggled with the missing control of level/scope in Tableau viz, this very much helps me to grasp the essence of LOD, how it works at the background. Thank you very much!

  20. Pingback: How to Calculate Headcount dynamically from Employees Transaction Records | Insights Through Data

Leave a reply to Touran Cancel reply