A case statement allows you to set column values when they fit the parameters you specify. The general formula is:
CASE WHEN "Column Name" (condition) THEN 'value' WHEN "Column Name" (condition2) THEN 'value2' ELSE "Column Name" END
In this example, we are getting the sum of the Engagement Index, grouped by Activity Name. This is what the Data Preview looks like:
We want to add a new column based on Engagement Index, where if Engagement Index is greater than 50,000, the new column should contain the string “High Engagement” for that row. If the sum is less than 50,000, we want the new column to contain the string “Low Engagement” for that row.
Start by adding an Add Column step in the Data Pipeline.
Call the column Engagement Group, and put the following case statement in the Formula section:
CASE WHEN "Engagement Index" > 50000 THEN 'High Engagement' WHEN "Engagement Index" < 50000 THEN "Low Engagement" ELSE "Engagement Index" END
Here is what the Add Column step looks like, along with a data preview after the step is applied.
Using a case statement to rename existing column values
If you are using a case statement to rename existing column values, you will need to hide the original column and reorder your columns.
- Add a Hide Columns step to hide the original column.
- Add a Reorder Columns step to reorder the new column so it is in the same place as the original.