Grouping a list of email addresses by domain is super easy in the Data Pipeline.
- In our example, we want to get the count of events grouped by email domain. So we’ll drag email into Dimensions, and set count of event_id as our Measure.
- Next, we’ll add a column in the Data Pipeline called “Domain” that extracts the domain from the email address. This is possible with SQLite’s substr() function. This is our formula:
substr(“Email Column Name”,instr(“Email Column Name”, ‘@’) + 1)
- Next, we’ll hide the “Email” column, as we don’t need it anymore. Add a Hide Columns step in the Pipeline and select the “Email” column from the dropdown.
- Then, add a Reorder Columns step to place Domain first, and Count second.
- Finally, we want to group our values by email domain. Add a Group step in the Pipeline. You’ll want to group by the “Domain” column, and select the SUM aggregation for your count column.
Here’s what our query and final table results look like:
Click to expand.