Essential checklist to keep your Power BI Solutions optimized. Part I: Optimizing the Data source

 In Performance, Power BI, Power Query

This is a series of articles about the most important practices at every step of your solution to get scalable data models and faster reports:

At the end of this series of articles, you will be able to download the summary checklist in Excel format, I recommend you keep it handy to review it every time you build a Power BI solution from scratch or need to improve the performance of your existing developments.

Part I. Optimizing the Data source

This includes the source, connections, and power query transformations performed prior to loading a table to the data model.

– Ensuring query folding occurs whenever possible: Applying query folding in your power query tables could exponentially reduce resource utilization and refresh time.

When you are working with a relational data source (such as SQL Server) always check the query folding. If most of the steps do not support it, consider applying transformations on the data source using a database view or by physically preparing and materializing the data.

You must have the query folding enabled, to check it right-click on the last applied step in Power Query:

If the “View Native Query” option is grayed out, it means that this specific step or transformation is executed with local resources.

– Minimize the Power Query steps: Consolidate repeated steps, put all “remove columns” together, all “rename columns” together and so on. This change will improve the execution performance of the query.

– If your source is an Excel File, you could convert it to CSV before importing it into Power BI: The performance of reading data from a CSV is better.

To convert the Excel files, you could use the CloudConvert API to CSV files, and then save them to Azure Blob storage.

However, the recommendation for big projects is always to move the data to a database.

– Disable Power Query background refresh: By default, when Power Query imports data it retrieves and stores 1000 rows of preview data for each query.

If your Power BI Desktop file contains many queries, this option could extend the time it takes to complete a refresh, in these cases it is convenient to disable it.

– If safe, Ignore the Privacy levels or set both sources to Organizational: This setting can potentially improve performance and functionality, however, Power BI Desktop cannot ensure the privacy of data merged into the Power BI Desktop file.

Don´t deactivate this option in the following scenarios:

  • If the data source contains sensitive data.
  • Queries that accept query folding like SQL.
  • If the dataset will be published.

– Use Query Diagnostics to analyze the query performance and identify the bottlenecks exist while loading and transforming your data and refreshing your data in Power Query.

Click here for part 2 of this series.

 

Recent Posts

Leave a Comment