There may be times where you need to count items within long strings, for example, to use as part of a formula or even to use as part of conditional formatting. Below are a few ways to count parts of strings in the Data Pipeline.
Count keywords in strings
You can easily count the number of instances a specific word is found in one of your columns.
For example, we’d like to count the number of rows in our “URL” column containing the word
test. We want to count any result containing this word in a row without excluding results that are not exact. To do this, we’ll use conditional logic with a CASE statement.
We will need to add a column in the Data Pipeline (here labeled ‘Test’) select custom formula as the Formula Type and enter the CASE statement below in the Formula field, replacing
URL with your column name and
test with your keyword:
case when "URL" like '%test%' then 1 else 0 end
Note: Make sure to wrap your keyword in wildcards (%) to make sure the format accounts for strings with your keyword at any place in the string.
You could then further aggregate the results of this column by adding a Group Rows step in the Data Pipeline to count all values in your new column.
It’s also possible to create a flag where every time the word
test is present in the “URL” column, the word
yes would appear. If this keyword isn’t present, the word
no would appear. This can also be done using a CASE statement in a Custom formula:
case when "URL" like '%test%' then 'yes' else 'no' end
Count all characters in strings
You can also count all the characters within a string by using the length() function. This information may then be useful to use as part of other formulas.
To do this, simply add a column in the Pipeline and select a custom formula. You’ll then enter the following formula, replacing
Column Name with your column name:
Note: Using this formula counts all characters in the string, including spaces.
If you’d like to exclude spaces from the character count, you can nest the replace function within the length function to remove any spaces from your string before you get the character count. You’d use the following formula, replacing
Column Name with the name of your column:
length((replace("Column Name",' ','')))