Creating a Generalisable function

Guides • Data Integration & ETL • 23/06/2019

Overview

This article explains how to turn the ETL actions explained here: Generalised unpivoting, into one re-usable generalizable function that can be used on any pivoted summary table.

This is useful, if for example, you have different reports, each with a different summary table of: M x N dimensions and hierarchies.

Using a function here means that instead of going through the preceding sequence of steps manually for each type of summary table, you can aggregate all steps into a single transformation step!

Here is how you would do it:

Steps

1. Define initial queries

Create a “RowFields” query

Create a corresponding “ColumnFields” query

Define the name of the “ValueField” – below, this has been given the value: ‘Revenue’

2. Bring in the source data

3. Write some ‘M’ code

In a new query, write the following ‘M’ code:

M-Code


    (Source, RowFields, ColumnFields, ValueField)=>
let
    #"Filled Down" = Table.FillDown(Source, List.FirstN(Table.ColumnNames(Source), List.Count(RowFields) - 1)),
    #"Merged Columns" = Table.CombineColumns(#"Filled Down", List.FirstN(Table.ColumnNames(#"Filled Down"), List.Count(RowFields)), Combiner.CombineTextByDelimiter(":", QuoteStyle.None),"Merged"),
    #"Transposed Table" = Table.Transpose(#"Merged Columns"),
    #"Filled Down1" = Table.FillDown(#"Transposed Table",List.FirstN(Table.ColumnNames(#"Transposed Table"), List.Count(ColumnFields)-1)),
    #"Promoted Headers" = Table.PromoteHeaders(#"Filled Down1", [PromoteAllScalars=true]),
    #"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Promoted Headers",List.FirstN(Table.ColumnNames(#"Promoted Headers"), List.Count(ColumnFields)) , "Attribute", ValueField),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Unpivoted Other Columns", "Attribute", Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv), RowFields),
    #"Renamed Columns" = Table.RenameColumns(#"Split Column by Delimiter", List.Zip({List.FirstN(Table.ColumnNames(#"Split Column by Delimiter"), List.Count(ColumnFields)), ColumnFields})),
    #"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{ValueField, type number}})
in
    #"Changed Type"

The above code takes in the parameters from: Source, RowFields, ColumnFields, and ValueField - as defined from the earlier queries, and runs them through a series of ETL steps that can applied to a very wide variety of sparse pivoted tables.

Important parts of the code are elaborated further below:

Fill Down:

M-Code


    #"Filled Down" = Table.FillDown(Source, List.FirstN(Table.ColumnNames(Source), List.Count(RowFields) - 1)),

Table.FillDown takes its first argument as the Table, and the second argument as a list of column names.

To turn this into a dynamic list, you first need to get a list of all column names in the Table. This is what the function Table.ColumnNames does.

To return only N-1 columns, you use the List.Count function, which in the above line of code, counts the number of elements, and deducts 1.

Merge Columns:

M-Code


    #"Merged Columns" = Table.CombineColumns(#"Filled Down", List.FirstN(Table.ColumnNames(#"Filled Down"), List.Count(RowFields)), Combiner.CombineTextByDelimiter(":", QuoteStyle.None),"Merged"),

This line dynamically refers back to the N column names of the source table, and uses N as the second argument of the List.FirstN function (as you need to combine the first N columns, not just N-1 columns)

Fill Down 2:

M-Code


    #"Filled Down1" = Table.FillDown(#"Transposed Table",List.FirstN(Table.ColumnNames(#"Transposed Table"), List.Count(ColumnFields)-1)),

Unlike the modified Table.FillDown, from earlier, which used List.Count, on RowFields, this uses List.Count on ColumnFields to get the number M-1

Unpivot:

M-Code


    #"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Promoted Headers",List.FirstN(Table.ColumnNames(#"Promoted Headers"), List.Count(ColumnFields)) , "Attribute", ValueField),

Here, M instead of M-1 are unpivoted. Table.ColumnNames is applied to the table #”Promoted Headers”, which represents the state of the table in the previous transformation step.

Here, the query ValueField instead of “Value” is included as the new name of the Value column

Split Column:

M-Code


    #"Split Column by Delimiter" = Table.SplitColumn(#"Unpivoted Other Columns", "Attribute", Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv), RowFields),

The third argument of Table.SplitColumn takes in two important elements: the number of columns to split, and the names of the columns to split. These are already defined in the RowFields query, which is why this is used as the last parameter.

Renaming Columns & List.Zip:

M-Code


    #"Renamed Columns" = Table.RenameColumns(#"Split Column by Delimiter", List.Zip({List.FirstN(Table.ColumnNames(#"Split Column by Delimiter"), List.Count(ColumnFields)), ColumnFields})),

This line is probably the most complicated step.

The code renames the columns by passing a “list of lists” as the second argument of Table.RenameColumns

The RowField and Value columns have already been renamed correctly from the earlier steps, so only the first M columns need to be renamed. So, to obtain the new names, you can use the ColumnFields query

What List.Zip does is get two lists as inputs, and returns a single list of lists, as pairs from both lists.

For example:

List.Zip ({“A”, “B”, “C”},{“1”,”2”,”3”}),

will return the following list:

{{“A”,”1”}, {“B”,”2”}, {“C”,”3”}}

The above code therefore gets the names of the first M columns, as they are currently named, and replaces them with what they are called in the query, ColumnFields.

Changed Type:

M-Code


    #"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{ValueField, type number}})

To explicitly change the type of Revenue column to a Decimal Number, use the above code. The use of the ValueField parameter ensures the column is renamed to “Revenue”. Type Number ensures it is of type Decimal Number

4. Apply the function to the Source Data

With the function built, you can now apply it to the Source Data query.

Create a new query, called “Results” – as follows:

Final Comments

Here, we have seen how to handle complex summary tables with any number of hierarchy of rows and columns, using functions such as Table.ColumnNames, List.FirstN, or List.Zip.

This gives the ability to make queries very flexible, so they can handle future scenarios and scale up to a variety of source table structures.

Creating a Generalisable function

Overview

Steps

1. Define initial queries

2. Bring in the source data

3. Write some ‘M’ code

4. Apply the function to the Source Data

Final Comments

Feedback