Data TutorialsSQL Tips

Data tutorial: Using Window Functions

Posted by AJ Welch

What are Window Functions?

Window functions enable users to perform calculations against partitions (i.e. subgroups or sections) of a result set, typically a table or the results from another query. Unlike traditional aggregation functions, which return only a single value for each group defined in the query, window functions return a single value for each input row.

How are window functions helpful?

Window functions increase the efficiency and reduce the complexity of queries that analyze partitions (windows) of a data set by providing an alternative to more complex SQL concepts, e.g. derived queries. Common use cases include:

  • Ranking results within a specific window (e.g. per-group ranking)
  • Accessing data from another row in a specific window (e.g. period-over-period reporting)
  • Aggregation within a specific window (e.g. running totals):

How to use Window Functions

Window functions are initiated with the OVER clause, and are configured using three concepts:

  • window partition (PARTITION BY) - groups rows into partitions
  • window ordering (ORDER BY) - defines the order or sequence of rows within each window
  • window frame (ROWS) - defines the window by use of an offset from the specified row

Syntax Overview

function (expression) OVER
     ( [ PARTITION BY expression_list ]
       [ ORDER BY order_list ]
       [ ROWS frame_clause ])

Definitions

  • expression_list: comma separated list of expressions, typically column names
  • order_list: comma separated list of expressions, typically column names
  • frame_clause: define the offset using CURRENT ROW, _value_ PRECEDING, UNBOUNDED PRECEDING, _value_ FOLLOWING, UNBOUNDED FOLLOWING

Examples

Ranking: per-group ranking

SELECT
      campaign
    , month
    , total_cost
    , rank() OVER (PARTITION BY month ORDER BY month DESC, total_cost DESC) as "rank"
FROM Cost_by_Month
LIMIT 8

Previous Row: period-over-period analysis

SELECT
      campaign
    , month
    , total_cost
    , LAG(total_cost, 1) OVER (ORDER BY campaign, month) as "previous total cost"
FROM Cost_by_Month
WHERE campaign = 'TV'
LIMIT 4

Aggregation Within a Window: running totals

SELECT
    campaign, month, monthly_cost
  , sum(monthly_cost) OVER (PARTITION BY campaign ORDER BY month ROWS BETWEEN UNBOUNDED
    PRECEDING and CURRENT ROW) as "running cost total"
FROM Cost_by_Month

Documentation & Function List:

Please note that subtle differences in usage and syntax exist from database to database. For information specific to your implementation, please use the source-specific documentation provided below.

Postgres

Redshift

Oracle

SQLServer

BigQuery

Visual SQL Hero

Introducing Visual SQL

SQL may be the language of data, but not everyone can understand it. With our visual version of SQL, now anyone at your company can query data from almost any source—no coding required.

Learn about Visual SQL