Sorting data buckets for display

Charts

It’s common for businesses to look at data grouped in buckets of different ranges. For example, you may want to see how many people are signing up for your site by a segment such as company size. If you used # of Employees as the x-axis you get charts that look like this:

Unbucketed chart

The x-axis is too granular to be useful. The fix is to create buckets of the data for easier understanding. The extra complexity with creating a bucket range though is that the x-axis then becomes a string of the numeric range like “25-50” or a label you’ve chosen for that range like “midsize”. In either case, the visualization won’t have the right information to sort the data:

String Bucket sorted chart

Here’s the raw data of the above chart that we’ll use for this example. You can see that because it’s a string it’s being sorted alphabetically, not numerically.

Company Size Signups
1-10 516
10000+ 46
1001-10000 117
101-250 276
11-25 342
251-1000 215
26-50 331
51-100 217

We can sort this out with a couple data pipeline steps and the flexibility Chartio has with enabling you to create new custom columns using any function available in SQLite. We’ll be taking advantage of the SQLite string functions and it’s cast function.

What we’re going to do is create a new column that is just the first number in the bucket range strings.

1. Create a new column

To start add a column in your data pipeline.

Add column

2. Parse the column and convert to an integer

Change it to a “custom formula” and paste in the following, replacing “company size” with the name of your bucket column.

cast(substr("company size", 0, instr(replace("company size", "+", "-"), '-')) as INT)

In short, what the above function is doing is getting only the first number (the parts before the “+” and the “-“) of the string and then casting it into an integer. If you’d like to understand the function further, take a look at our string function documentation and play around with running each part of the function independently.

3. Sort and hide

Now we have a new numeric column that is just the first value of our ranges, and we can add a sort step to sort by the numeric value.

Size sort

And finally, we don’t need to chart the numeric value so we can hide the column.

Result

The result is now a nicely sorted bucketed histogram.

Sorted bucket chart

If you have any troubles or question in getting this to work don’t hesitate to reach out at help@chartio.com.


Related Charts Help Articles

See more