Using Window Functions

Data Tutorial

What are Window Functions?

Window functions enable users to perform calculations against partitions (i.e. subgroups or sections) of a result set, typically a table or the results from another query. Unlike traditional aggregation functions, which return only a single value for each group defined in the query, window functions return a single value for each input row.

How are window functions helpful?

Window functions increase the efficiency and reduce the complexity of queries that analyze partitions (windows) of a data set by providing an alternative to more complex SQL concepts, e.g. derived queries. Common use cases include:

  • Ranking results within a specific window (e.g. per-group ranking)
  • Accessing data from another row in a specific window (e.g. period-over-period reporting)
  • Aggregation within a specific window (e.g. running totals):

How to use Window Functions

Window functions are initiated with the OVER clause, and are configured using three concepts:

  • window partition (PARTITION BY) - groups rows into partitions
  • window ordering (ORDER BY) - defines the order or sequence of rows within each window
  • window frame (ROWS) - defines the window by use of an offset from the specified row

Syntax Overview

function (expression) OVER
     ( [ PARTITION BY expression_list ]
       [ ORDER BY order_list ]
       [ ROWS frame_clause ])
Definitions
  • expression_list: comma separated list of expressions, typically column names
  • order_list: comma separated list of expressions, typically column names
  • frame_clause: define the offset using CURRENT ROW, _value_ PRECEDING, UNBOUNDED PRECEDING, _value_ FOLLOWING, UNBOUNDED FOLLOWING

Examples

Ranking: per-group ranking

SELECT
      campaign
    , month
    , total_cost
    , rank() OVER (PARTITION BY month ORDER BY month DESC, total_cost DESC) as "rank"
FROM Cost_by_Month
LIMIT 8

Previous Row: period-over-period analysis

SELECT
      campaign
    , month
    , total_cost
    , LAG(total_cost, 1) OVER (ORDER BY campaign, month) as "previous total cost"
FROM Cost_by_Month
WHERE campaign = 'TV'
LIMIT 4

Aggregation Within a Window: running totals

SELECT
    campaign, month, monthly_cost
  , sum(monthly_cost) OVER (PARTITION BY campaign ORDER BY month ROWS BETWEEN UNBOUNDED
    PRECEDING and CURRENT ROW) as "running cost total"
FROM Cost_by_Month

Documentation & Function List:

Please note that subtle differences in usage and syntax exist from database to database. For information specific to your implementation, please use the source-specific documentation provided below.

Postgres

Redshift

Oracle

SQLServer

BigQuery