Query Folding

Overview

In a typical ETL process, when the data source is a file or files e.g. a csv(s), Power Query will read the data from the file, and perform any transformation tasks inside Power Query itself. This is fine for most relatively simple tasks (e.g. when removing or renaming columns) and relatively small files (say less than 1 million rows).

But bad things can happen when you try to load tens of millions of rows using Power Query from a csv file and then try to preform ‘heavy’ tasks, such as sorting over the data. To sort 10 million rows from a csv, Power Query would need to load every record into memory and then complete the sort task from there. Whilst it is certainly possible to do this, it is not always the best approach.

Query Folding

Inside Power Query, the term query folding refers to the process of taking complex tasks generated by the Power Query UI (such as sorting 10 million rows of data) and pushing those tasks back to the database for execution.

Power Query will automatically execute Query Folding, but only under certain conditions. These include:

  1. The source needing to be a database that can accept a folding request. Most relational databases are included here, but this also extends to OData sources, SSAS, and services such as Google Analytics
  2. The ODBC / ODBA driver being used, supports query folding. (Some don’t!)
  3. Starting a query with SQL code written by yourself (known as a native query). If this is done, no later steps inside Power Query will be folded.
  4. Performing a complex power query step that cannot be folded. If this happens, Query Folding will stop for the rest of the query.

Query Folding active?

Method 1: Native query

When you use Power Query, the ETL steps are listed in the right side “applied steps” panel. You can check if a query step has generated Query Folding by checking for the existence of Native Query code inside that step.

To do this, right-click on one of the query steps and then select “view Native Query”. If the option is not greyed out, i.e. if view Native Query is enabled, it means Power Query has successfully performed Query Folding. It the option is greyed out, it means Query Folding has stopped

Method 2: Metadata

Another, more advanced way of telling if a query is performing Query Folding, is to go to the formula bar and wrap your step code with the function Value.Metadata, for example:

M-Code
    = Value.Metadata(#”Removed Other Columns”)

The result of this will give you a record. If the record has the “Query Folding” field, you should be able to navigate through that record to see what’s inside (You should be able to see an IsFolded parameter, set to TRUE, to know whether Query Folding is active).