There may be times where you need to count items within long strings, for example, to use as part of a formula or even to use as part of conditional formatting. Below are a few ways to count parts of strings in the Data Pipeline.
Count keywords in strings
You can easily count the number of instances a specific word is found in one of your columns.
For example, we would like to count the number of rows in our ‘URL’ column containing the word “test”. We want to count any result containing this word in a row without excluding results that are not exact. To do this, we will use conditional logic with a case statement.
We will need to add a column in the Data Pipeline (here labeled ‘Test’) select custom formula as the Formula Type and enter the case statement below in the Formula field, replacing URL with your column name and test with your keyword:
case when "URL" like '%test%' then 1 else 0 end
Note: Make sure to wrap your keyword in wildcards (%) to make sure the format accounts for strings with your keyword at any place in the string.
You could then further aggregate the results of this column by adding a Group Rows step in the Data Pipeline to count all values in your new column.
It’s also possible to create a flag where every time the word “test” is present in the ‘URL’ column, the word “yes” would appear. If this keyword isn’t present, the word “no” would appear. This can be done with a custom formula such as:
case when "URL" like '%test%' then 'yes' else 'no' end
Count all characters in strings
You can also count all the characters within a string by using the length() function. This information may then be useful to use as part of other formulas.
To do this, simply add a column in the Pipeline and select a custom formula. You’ll then want to enter the following into the Formula field, replacing Column Name with your column name:
Note: Using this formula counts all characters in the string, including spaces.
If you’d like to count all characters within a string without spaces, you can combine a replace() function within your length() function. Doing this will ensure we first remove any spaces from your string and then calculate the length. The formula will look as follows, replacing Column Name with the name of your column:
length((replace("Column Name",' ','')))