Window functions

Description

A window function call represents the application of an aggregate-like function over some portion of the rows selected by a query. Unlike non-window aggregate calls

Example

this is not tied to grouping of the selected rows into a single output row — each row remains separate in the query output. However the window function has access to all the rows that would be part of the current row’s group according to the grouping specification (PARTITION BY list) of the window function call. The syntax of a window function call is one of the following:

function_name ([expression [

expression … ]]) [ FILTER ( WHERE filter_clause ) ] OVER window_name
function_name ([expression [

expression … ]]) [ FILTER ( WHERE filter_clause ) ] OVER ( window_definition )
function_name ( * ) [ FILTER ( WHERE filter_clause ) ] OVER window_name
function_name ( * ) [ FILTER ( WHERE filter_clause ) ] OVER ( window_definition )

where window_definition has the syntax

[ existing_window_name ]
[ PARTITION BY expression [

…] ]
[ ORDER BY expression [ ASC | DESC | USING operator ] [ NULLS { FIRST | LAST } ] [

…] ]
[ frame_clause ]

and the optional frame_clause can be one of

{ RANGE | ROWS } frame_start
{ RANGE | ROWS } BETWEEN frame_start AND frame_end

where frame_start and frame_end can be one of

UNBOUNDED PRECEDING
value PRECEDING
CURRENT ROW
value FOLLOWING
UNBOUNDED FOLLOWING

Here

expression represents any value expression that does not itself contain window function calls.

SELECT * FROM employees;
last_name   salary   department
Jones       45000    Accounting
Adams       50000    Sales
Johnson     40000    Marketing
Williams    37000    Accounting
Smith       55000    Sales

Lets assume that you wanted to find the highest paid person in each department. There’s a chance you could do this by creating a complicated stored procedure

or maybe even some very complex SQL. Most developers would even opt for pulling the data back into their preferred language and then looping over results. With window functions this gets much easier.

First we can rank each individual over a certain grouping:

=> SELECT last_name

salary

department

rank() OVER (
PARTITION BY department
ORDER BY salary
DESC
)
FROM employees;
last_name | salary | department | rank
———-+——–+————+—–
Jones | 45000 | Accounting | 1
Williams | 37000 | Accounting | 2
Smith | 55000 | Sales | 1
Adams | 50000 | Sales | 2
Johnson | 40000 | Marketing | 1

Hopefully its clear from here how we can filter and find only the top paid employee in each department:

=> SELECT *
FROM (
SELECT
last_name

salary

department

rank() OVER (
PARTITION BY department
ORDER BY salary
DESC
)
FROM employees) sub_query
WHERE rank = 1;
last_name | salary | department | rank
———-+——–+————-+—–
Jones | 45000 | Accounting | 1
Smith | 55000 | Sales | 1
Johnson | 40000 | Marketing | 1

window_name is a reference to a named window specification defined in the query’s WINDOW clause. Alternatively

a full window_definition can be given within parentheses

using the same syntax as for defining a named window in the WINDOW clause; see the SELECT reference page for details. It’s worth pointing out that OVER wname is not exactly equivalent to OVER (wname …); the latter implies copying and modifying the window definition

and will be rejected if the referenced window specification includes a frame clause.

The PARTITION BY clause groups the rows of the query into partitions

which are processed separately by the window function. PARTITION BY works similarly to a query-level GROUP BY clause

except that its expressions are always just expressions and cannot be output-column names or numbers. Without PARTITION BY

all rows produced by the query are treated as a single partition. The ORDER BY clause determines the order in which the rows of a partition are processed by the window function. It works similarly to a query-level ORDER BY clause

but likewise cannot use output-column names or numbers. Without ORDER BY

rows are processed in an unspecified order.

The frame_clause specifies the set of rows constituting the window frame

which is a subset of the current partition

for those window functions that act on the frame instead of the whole partition. The frame can be specified in either RANGE or ROWS mode; in either case

it runs from the frame_start to the frame_end. If frame_end is omitted

it defaults to CURRENT ROW.

A frame_start of UNBOUNDED PRECEDING means that the frame starts with the first row of the partition

and similarly a frame_end of UNBOUNDED FOLLOWING means that the frame ends with the last row of the partition.

In RANGE mode

a frame_start of CURRENT ROW means the frame starts with the current row’s first peer row (a row that ORDER BY considers equivalent to the current row)

while a frame_end of CURRENT ROW means the frame ends with the last equivalent ORDER BY peer. In ROWS mode

CURRENT ROW simply means the current row.

The value PRECEDING and value FOLLOWING cases are currently only allowed in ROWS mode. They indicate that the frame starts or ends the specified number of rows before or after the current row. value must be an integer expression not containing any variables

aggregate functions

or window functions. The value must not be null or negative; but it can be zero

which just selects the current row.

The default framing option is RANGE UNBOUNDED PRECEDING

which is the same as RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. With ORDER BY

this sets the frame to be all rows from the partition start up through the current row’s last ORDER BY peer. Without ORDER BY

all rows of the partition are included in the window frame

since all rows become peers of the current row.

Restrictions are that frame_start cannot be UNBOUNDED FOLLOWING

frame_end cannot be UNBOUNDED PRECEDING

and the frame_end choice cannot appear earlier in the above list than the frame_start choice — for example RANGE BETWEEN CURRENT ROW AND value PRECEDING is not allowed.

If FILTER is specified

then only the input rows for which the filter_clause evaluates to true are fed to the window function; other rows are discarded. Only window functions that are aggregates accept a FILTER clause.

Other window functions can be added by the user. Also

any built-in or user-defined general-purpose or statistical aggregate can be used as a window function. (Ordered-set and hypothetical-set aggregates cannot presently be used as window functions.)

The syntaxes using * are used for calling parameter-less aggregate functions as window functions

for example count(*) OVER (PARTITION BY x ORDER BY y). The asterisk (*) is customarily not used for window-specific functions. Window-specific functions do not allow DISTINCT or ORDER BY to be used within the function argument list.

Window function calls are permitted only in the SELECT list and the ORDER BY clause of the query.