Aggregate Functions

Learn SQL

Fetching the raw data is nice and all, but now we’re going to start actually doing some aggregations and transformations to it! The first and probably most commonly used aggregation function we are going to learn is COUNT. The COUNT function takes whatever you give it and returns the count of how many there are.

The following SQL will count how many albums are in our database. Put another way, we’re going to query for a count of the number of rows are in the ablums table. Play around yourself and find how many are in the artists and tracks tables as well.

COUNT of specific columns

As the COUNT goes over the data results it only increments its tally if the data is not NULL (NULL means empty in SQL jargon). Doing a COUNT(*) will always return the full count of the number of rows that exist in the table as the splat (*) represents each column and there’s no way that all the columns in the row would be NULL. If you specify a specific column however, you’re returning the COUNT of the number of rows where that column is not NULL. So doing a count of the composer column:

Will be less than the COUNT(*)

because the composer column has some NULL values (aka. it’s empty sometimes).

COUNT DISTINCT

A commonly used clause with the count function is DISTINCT. The DISTINCT clause changes the count to only tally the number of unique values in the data. Above we fetched how many tracks had composers listed. If we actually wanted to see how many unique composers were in our tracks table we could use the COUNT with the DISTINCT clause as shown here:

Can you modify the query above to find how many different genre_ids are the tracks table?

Aliases

A quick aside here: Notice that the column headers on the above datasets weren’t all that clear. SQL does a okay job of finding a name for what you’re fetching but often, especially as we start making more complex functions, you’ll want to use your own alias for the data. You can do so with the AS key word following your selections:

Be sure to use double quotes (“) around your Aliases as double quotes are used for column titles.

Functions

The following is a list of the most commonly used functions in SQL. They work similar to COUNT but perform different calculations on the data.

Function Description
MAX returns the largest (maximum) number in a sets
MIN described
COUNT returns a count of the # of values in a set
COUNT DISTINCT returns a count of the # of unique (distinct) values in a set
EVERY returns true if all data inside is true (same as bool_and)
AVG returns the average (mean) of the set of numbers
SUM returns the sum of all the values in the set

The following example gives the range and average prices of the tracks using the MIN, MAX and AVG functions.

Can you modify the above query to return how much it would cost to buy one of every track in the database?

We only covered the most commonly used aggregation functions here. If you’d like to see more checkout the full list of PostgreSQL Functions


If you're finding this SQL Tutorial helpful, you may also enjoy our product Chartio. Chartio is on a mission to enable anyone in a company with the power of a data analyst. To do so we've built a very user friendly interface for people to explore and visualize data with either SQL, or our drag and drop Visual SQL. You can connect to all of your different data sources, and quickly create visuals and dashboards to share with your team. Give us a spin, there's a 2 week free trial.