Window functions
Description
A window function call represents the application of an aggregate-like function over some portion of the rows selected by a query. Unlike non-window aggregate calls
Example
this is not tied to grouping of the selected rows into a single output row — each row remains separate in the query output. However the window function has access to all the rows that would be part of the current row’s group according to the grouping specification (PARTITION BY list) of the window function call. The syntax of a window function call is one of the following:
function_name ([expression [
expression … ]]) [ FILTER ( WHERE filter_clause ) ] OVER window_name
function_name ([expression [
expression … ]]) [ FILTER ( WHERE filter_clause ) ] OVER ( window_definition )
function_name ( * ) [ FILTER ( WHERE filter_clause ) ] OVER window_name
function_name ( * ) [ FILTER ( WHERE filter_clause ) ] OVER ( window_definition )
where window_definition has the syntax
[ existing_window_name ]
[ PARTITION BY expression [
…] ]
[ ORDER BY expression [ ASC | DESC | USING operator ] [ NULLS { FIRST | LAST } ] [
…] ]
[ frame_clause ]
and the optional frame_clause can be one of
{ RANGE | ROWS } frame_start { RANGE | ROWS } BETWEEN frame_start AND frame_end
where frame_start and frame_end can be one of
UNBOUNDED PRECEDING value PRECEDING CURRENT ROW value FOLLOWING UNBOUNDED FOLLOWING
Here
expression represents any value expression that does not itself contain window function calls.
SELECT * FROM employees; last_name salary department Jones 45000 Accounting Adams 50000 Sales Johnson 40000 Marketing Williams 37000 Accounting Smith 55000 Sales
Lets assume that you wanted to find the highest paid person in each department. There’s a chance you could do this by creating a complicated stored procedure
or maybe even some very complex SQL. Most developers would even opt for pulling the data back into their preferred language and then looping over results. With window functions this gets much easier.
First we can rank each individual over a certain grouping:
=> SELECT last_name
salary
department
rank() OVER (
PARTITION BY department
ORDER BY salary
DESC
)
FROM employees;
last_name | salary | department | rank
———-+——–+————+—–
Jones | 45000 | Accounting | 1
Williams | 37000 | Accounting | 2
Smith | 55000 | Sales | 1
Adams | 50000 | Sales | 2
Johnson | 40000 | Marketing | 1
Hopefully its clear from here how we can filter and find only the top paid employee in each department:
=> SELECT *
FROM (
SELECT
last_name
salary
department
rank() OVER (
PARTITION BY department
ORDER BY salary
DESC
)
FROM employees) sub_query
WHERE rank = 1;
last_name | salary | department | rank
———-+——–+————-+—–
Jones | 45000 | Accounting | 1
Smith | 55000 | Sales | 1
Johnson | 40000 | Marketing | 1
window_name is a reference to a named window specification defined in the query’s WINDOW clause. Alternatively
a full window_definition can be given within parentheses
using the same syntax as for defining a named window in the WINDOW clause; see the SELECT reference page for details. It’s worth pointing out that OVER wname is not exactly equivalent to OVER (wname …); the latter implies copying and modifying the window definition
and will be rejected if the referenced window specification includes a frame clause.
The PARTITION BY clause groups the rows of the query into partitions
which are processed separately by the window function. PARTITION BY works similarly to a query-level GROUP BY clause
except that its expressions are always just expressions and cannot be output-column names or numbers. Without PARTITION BY
all rows produced by the query are treated as a single partition. The ORDER BY clause determines the order in which the rows of a partition are processed by the window function. It works similarly to a query-level ORDER BY clause
but likewise cannot use output-column names or numbers. Without ORDER BY
rows are processed in an unspecified order.
The frame_clause specifies the set of rows constituting the window frame
which is a subset of the current partition
for those window functions that act on the frame instead of the whole partition. The frame can be specified in either RANGE or ROWS mode; in either case
it runs from the frame_start to the frame_end. If frame_end is omitted
it defaults to CURRENT ROW.
A frame_start of UNBOUNDED PRECEDING means that the frame starts with the first row of the partition
and similarly a frame_end of UNBOUNDED FOLLOWING means that the frame ends with the last row of the partition.
In RANGE mode
a frame_start of CURRENT ROW means the frame starts with the current row’s first peer row (a row that ORDER BY considers equivalent to the current row)
while a frame_end of CURRENT ROW means the frame ends with the last equivalent ORDER BY peer. In ROWS mode
CURRENT ROW simply means the current row.
The value PRECEDING and value FOLLOWING cases are currently only allowed in ROWS mode. They indicate that the frame starts or ends the specified number of rows before or after the current row. value must be an integer expression not containing any variables
aggregate functions
or window functions. The value must not be null or negative; but it can be zero
which just selects the current row.
The default framing option is RANGE UNBOUNDED PRECEDING
which is the same as RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. With ORDER BY
this sets the frame to be all rows from the partition start up through the current row’s last ORDER BY peer. Without ORDER BY
all rows of the partition are included in the window frame
since all rows become peers of the current row.
Restrictions are that frame_start cannot be UNBOUNDED FOLLOWING
frame_end cannot be UNBOUNDED PRECEDING
and the frame_end choice cannot appear earlier in the above list than the frame_start choice — for example RANGE BETWEEN CURRENT ROW AND value PRECEDING is not allowed.
If FILTER is specified
then only the input rows for which the filter_clause evaluates to true are fed to the window function; other rows are discarded. Only window functions that are aggregates accept a FILTER clause.
Other window functions can be added by the user. Also
any built-in or user-defined general-purpose or statistical aggregate can be used as a window function. (Ordered-set and hypothetical-set aggregates cannot presently be used as window functions.)
The syntaxes using * are used for calling parameter-less aggregate functions as window functions
for example count(*) OVER (PARTITION BY x ORDER BY y). The asterisk (*) is customarily not used for window-specific functions. Window-specific functions do not allow DISTINCT or ORDER BY to be used within the function argument list.
Window function calls are permitted only in the SELECT list and the ORDER BY clause of the query.