Finding Power BI Queries And Refreshes That Have Been Throttled With Workspace Monitoring

A lot of new documentation was published recently around Fabric capacities, including documentation on how to size and govern capacities and how to troubleshoot problems with capacities. The latter has instructions on how to use the Capacity Metrics App to determine if your Power BI queries are being throttled; it mentions that you can also determine if queries have been throttled using Workspace Monitoring or Log Analytics but doesn’t go into details about how, something I will address in this post.

The key to using Log Analytics and Workspace Monitoring to troubleshoot throttling is the Execution Metrics event. Queries and semantic model refreshes always have an Execution Metrics event associated with them, and you can make the connection between all events for a specific operation using the OperationId column in Workspace Monitoring (as mentioned in my last post) or the XmlaRequestId column in Log Analytics – these columns contain the same information, it’s just that the column name was changed in Workspace Monitoring. When a query or a refresh is delayed because of throttling the Execution Metrics event will include a metric called capacityThrottlingMs, which gives the total amount of time in ms that the query or refresh was delayed before it was allowed to execute.

The documentation for Execution Metrics contains a sample KQL query for Log Analytics that uses Execution Metrics to return aggregated data on throttling but doesn’t explain how it works. Here’s a version of that query rewritten for Workspace Monitoring that returns the individual query and refresh operations that were throttled in the last five days:

let
EventsLast5Days =
SemanticModelLogs
| where Timestamp > (5d)
| where OperationName in ("CommandEnd", "QueryEnd", "DiscoverEnd", "ExecutionMetrics");
let 
ExecutionMetricsEvents =
EventsLast5Days
| where OperationName == "ExecutionMetrics"
| project ExecutionMetrics = EventText, OperationId
| extend EM = parse_json(ExecutionMetrics)
| extend capacityThrottlingMs = toint(EM.capacityThrottlingMs)
| project-away  EM, ExecutionMetrics
| where capacityThrottlingMs > 0;
let
QueryDiscoverRefreshEvents = 
EventsLast5Days
| where OperationName in ("CommandEnd", "QueryEnd", "DiscoverEnd")
| project Timestamp, OperationName, OperationDetailName, QueryText = EventText, DurationMs, CpuTimeMs, OperationId, ExecutingUser;
ExecutionMetricsEvents
| join kind=inner QueryDiscoverRefreshEvents on OperationId
| project-away OperationId1

Here’s how this query works:

  • The EventsLast5Days variable gets all the events in the last 5 days which have an OperationDetailName of either CommandEnd (which is fired when a refresh finishes), QueryEnd (which is fired when a DAX or MDX query finishes), DiscoverEnd (which is fired when a Discover command, which is used to retrieve metadata from the Power BI engine, finishes) or ExecutionMetrics.
  • The ExecutionMetricsEvents variable takes the EventsLast5Days variable and filters it down to just the ExecutionMetrics event. It then parses the JSON value containing the metrics in the EventText column, extracts the capacityThrottlingMs metric if it is present, and filters the events down to just those where capacityThrottlingMs is greater than 0.
  • The QueryDiscoverRefreshEvents variable takes the EventsLast5Days variable and filters it down to just the CommandEnd, DiscoverEnd and QueryEnd events.
  • Finally there is an inner join between the ExecutionMetricsEvents variable and the QueryDiscoverRefreshEvents variable on the OperationId column, which results in a table of the CommandEnd, DiscoverEnd and QueryEnd events where throttling was present.

A couple of important things to point out:

  • The duration of an event in Workspace Monitoring, given in the DurationMs column, does not include any delays added as a result of capacity throttling. As a result you can have queries that take 0ms but which were delayed by 20 seconds – something you might not realise if you only look at the DurationMs column.
  • Having the DAX queries that are run when you display a Power BI report is bad enough, but throttling delays are also applied to Discover commands, and Power BI reports (and indeed every client tool) will run a number of these before they run the queries they needs to render a report. This means that the impact of throttling on report performance can be a lot greater than you might expect.
  • The new Timepoint Item Detail page in the Capacity Metrics App, in preview at the time of writing, includes a slicer on the OperationId column. This means that you can do the opposite of what I showed in my previous blog post and filter data in the Capacity Metrics App using an OperationId found from a query in Workspace Monitoring. This would allow you to find the percentage of the available CUs on a capacity that a given query or refresh used.

Share this Post

Comments (0)

Leave a comment