Distinct customers

Overview

This article explains how to find (from the table below) the number of distinct customers that bought something each day, whilst also accounting for the fact that some customers may have had multiple sales transactions per day.

Steps

The following steps are involved:

1. Create a query

  • Click inside the table in the Excel worksheet
  • Click the From Table/Range icon on the Data tab in Excel to create a new query. The Query Editor window will open.
  • Set the data type for the Date column by selecting it in the Query Editor and then, in the Data Type drop-down menu, select Date.

2. Remove duplicates

  • Select both the Date and the Customer columns in the Query Editor by clicking each of them with the Shift or Control key held down.
  • Next, click the Remove Duplicates button on the Home tab of the Query Editor toolbar, as shown below.

Once you have clicked the Remove Duplicates button, the table will be filtered so only the first row with each distinct combination of Date and Customer is returned, as shown below.

Notice how the table now has only 9 rows now, rather than 14.

3. Group data

You can now count the number of rows in the table for each date to find the number of distinct customers who bought something on that date.

  • Click the Date column to select it and then click the Group By button on the Home tab of the Query Editor toolbar. The Group By dialog will appear.
  • Change the text in the New column name box to read DistinctCustomers and ensure that in the Group By box the Date column is selected, and that in the Operation box, Count Rows is selected, as shown below.
  • Click OK to close the dialog.

You should now have a table showing the number of distinct customers that bought something for each date.

  • Ensure that the Load to worksheet box is checked and then click the Close & Load button to close the Query Editor.

The output of the query is shown below:

4. The ‘M’ code

M-Code

let
    //Load data from the Excel worksheet
    Source = Excel.CurrentWorkbook(){[Name="Customers"]}[Content],
   //Set the data type of the Date column to date
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"Customer", type text}, {"Sales", Int64.Type}}),
    //Remove all duplicate combinations of Date and Customer
    #"Removed Duplicates" = Table.Distinct(#"Changed Type", {"Date", "Customer"}),
    //Find the count of rows per date
    #"Grouped Rows" = Table.Group(#"Removed Duplicates", {"Date"}, {{"DistinctCustomers", each Table.RowCount(_), Int64.Type}})
in
    #"Grouped Rows"