What Happens When Power BI Direct Lake Semantic Models Hit Guardrails?

Direct Lake mode in Power BI allows you to build semantic models on very large volumes of data, but because it is still an in-memory database engine there are limits on how much data it can work with. As a result it has rules, called guardrails, that it uses to check whether you are trying to build a semantic model that is too large. But what happens when you hit those guardrails? This week one of my colleagues, Gaurav Agarwal, showed me the results of some tests he did, and I thought I would share them here.

Before I do that though, a bit more detail about what these guardrails are. They are documented in the table here; they vary by Fabric capacity SKU size, and there are four of them, which are limits on:

  • The number of Parquet files per Delta table
  • The number of rowgroups per Delta table
  • The number of rows per table
  • The total size of the data used by the semantic model on disk

There is also a limit on the amount of memory that can be used by a semantic model, something I have blogged about extensively, but technically that’s not a guardrail.
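
If you want to check where a table stands against the first three of those guardrails (and approximate the fourth), most of the numbers can be read straight from the table's Delta log. Here's a rough Python sketch, not official tooling, using the open source deltalake package: the table path is a placeholder, rowgroup counts have to be read from each Parquet footer (which can be slow on a huge table), and the summed Parquet file size is only a rough proxy for the on-disk size guardrail.

# Rough guardrail check for a Delta table. Assumes the `deltalake` and
# `pyarrow` packages and a locally accessible table path (placeholder).
from deltalake import DeltaTable
import pyarrow.parquet as pq

TABLE_PATH = "/lakehouse/default/Tables/billionrows"  # hypothetical path

dt = DeltaTable(TABLE_PATH)
adds = dt.get_add_actions(flatten=True).to_pydict()

num_files = len(dt.files())
num_rows = sum(n or 0 for n in adds["num_records"])  # None if stats missing
size_gb = sum(adds["size_bytes"]) / 1024**3          # rough proxy for size on disk

# Rowgroup counts aren't in the Delta log, so read each Parquet footer;
# this can take a while on tables with many files.
num_rowgroups = sum(
    pq.ParquetFile(f"{TABLE_PATH}/{f}").num_row_groups for f in dt.files()
)

print(f"files={num_files}, rowgroups={num_rowgroups}, "
      f"rows={num_rows:,}, size={size_gb:.1f} GB")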

Remember also that there are two types of Direct Lake mode (documented here). The original type, Direct Lake on SQL Endpoints, which I will refer to as DL/SQL, has the ability to fall back to DirectQuery mode; the newer type, Direct Lake on OneLake, which I will refer to as DL/OL, cannot fall back to DirectQuery.

For his tests, Gaurav built a Fabric Warehouse containing a single table. He then added more and more rows to this table to see how a Direct Lake semantic model built on this Warehouse behaved. Here’s what he found.
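
Gaurav loaded his data into a Warehouse; if you want to build a similarly large test table yourself, here's a rough sketch (my code, not his) that appends batches of generated rows to a Lakehouse Delta table from a Fabric notebook, where spark is the pre-initialised Spark session and the table name and batch size are arbitrary:

# Append a batch of generated rows to a Delta table in a Lakehouse.
# `spark` is the session pre-initialised in a Fabric notebook; rerun
# this cell to keep growing the table.
rows_per_batch = 1_000_000_000

df = spark.range(rows_per_batch).withColumnRenamed("id", "rownum")
df.write.format("delta").mode("append").saveAsTable("billionrows")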

If you build a DL/SQL model that exceeds one of the guardrails for the capacity SKU you are using then, when you refresh that model, the refresh will succeed and you will see the following warning message in the model’s Refresh History:

We noticed that the source Delta tables exceed the resource limits of the Premium or Fabric capacity requiring queries to fallback to DirectQuery mode. Ensure that the Delta tables do not exceed the capacity's guardrails for best query perf.

This means that even though the refresh has succeeded, because the model has exceeded one of the guardrails it will always fall back to DirectQuery mode, with all the associated performance implications.

If your DL/SQL model exceeds one of the guardrails for the largest Fabric SKU, an F2048, then refresh will fail, but you will still be able to query the model because it will again fall back to DirectQuery mode. For his tests, Gaurav loaded 52 billion rows into a table; the guardrail for the maximum number of rows in a table on an F2048 is 24 billion rows. The top-level message you get when you refresh in this case is simply:

An error occurred while processing the semantic model.

Although if you look at the details you’ll see a more helpful message:

We cannot refresh the semantic model because of a Delta table issue that causes framing to fail. The source Delta table 'billionrows' has too many parquet files, which exceeds the maximum guardrails. Please optimize the Delta table. See https://go.microsoft.com/fwlink/?linkid=2316800 for guardrail details.
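
The fix the message suggests is to compact the table's Parquet files. In a Fabric notebook that's a single Spark SQL statement (the table name is again a placeholder), although of course compaction can only help with the file and rowgroup guardrails, not the row count one:

# Compact small Parquet files into fewer, larger ones; this reduces the
# file and rowgroup counts but obviously not the number of rows.
spark.sql("OPTIMIZE billionrows")

# Optionally remove the old, now-unreferenced files afterwards
spark.sql("VACUUM billionrows")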

The DAX TableTraits() function, which another colleague, Sandeep Pawar, blogged about here, can also tell you the reason why a DL/SQL semantic model is falling back to DirectQuery mode. Running the following DAX query on the 52 billion row model:

EVALUATE TABLETRAITS()

…returned the following results:

This shows that the table called billionrows actually exceeds the guardrails for the number of files, the number of rowgroups and the number of rows.
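
If you prefer notebooks to a DAX query tool, you can run the same query with the semantic-link (sempy) library in a Fabric notebook; the model name below is a placeholder:

# Run TABLETRAITS() against a semantic model from a Fabric notebook
# using semantic-link (sempy). The dataset name is hypothetical.
import sempy.fabric as fabric

traits = fabric.evaluate_dax(
    dataset="BillionRowModel",  # hypothetical semantic model name
    dax_string="EVALUATE TABLETRAITS()",
)
print(traits)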

What about DL/OL models though? Since they cannot fall back to DirectQuery mode, when you try to build or refresh a DL/OL semantic model that exceeds a guardrail, you’ll get an error and you won’t be able to query your semantic model at all. For example, here’s what I saw in Power BI Desktop when I tried to use the 52 billion row table in a DL/OL model:

Something went wrong connecting to this item. You can open the item in your browser to see if there is an issue or try connecting again.
We cannot refresh the semantic model because of a Delta table issue that causes framing to fail. The source Delta table 'billionrows' has too many parquet files, which exceeds the maximum guardrails. Please optimize the Delta table. 

All of this behaviour makes sense if you think about it, even though I wouldn’t have known exactly how things worked until I saw it for myself. Some of this behaviour may change in the future to make it more intuitive; if that happens I will update this post.

[Thanks to Gaurav for showing me all this – check out his podcast and the India Fabric Analytics and AI user group that he helps run on LinkedIn. Thanks also to Akshai Mirchandani and Phil Seamark for their help]
