Window functions compute values across a set of rows that are somehow related to the current row, without collapsing the result set. They help you compare, rank, or accumulate across rows while preserving individual detail.
Unlike aggregate functions (which group and collapse data), window functions add extra insight per row.
FUNCTION(...) OVER (
PARTITION BY ...
ORDER BY ...
[ROWS or RANGE BETWEEN ...]
)
Each clause plays a specific role:
PARTITION BY
β divides the dataset into groupsORDER BY
β defines how rows are sequenced within those groupsROWS BETWEEN
β defines which rows to include in each calculationThink of PARTITION BY
as creating mini subgroups within your data. Each group resets the window function logic.
CustomerID | OrderDate | Amount |
---|---|---|
C1 | 2023-01-01 | 100 |
C1 | 2023-01-10 | 200 |
C2 | 2023-01-01 | 300 |
C2 | 2023-01-15 | 400 |
SELECT CustomerID, OrderDate, Amount,
ROW_NUMBER() OVER (ORDER BY OrderDate) AS RowNum_All,
ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY OrderDate) AS RowNum_ByCustomer
FROM Orders;
CustomerID | OrderDate | Amount | RowNum_All | RowNum_ByCustomer |
---|---|---|---|---|
C1 | 2023-01-01 | 100 | 1 | 1 |
C2 | 2023-01-01 | 300 | 2 | 1 |
C1 | 2023-01-10 | 200 | 3 | 2 |
C2 | 2023-01-15 | 400 | 4 | 2 |
π Notice how PARTITION BY resets the row numbering inside each group.
PARTITION BY CustomerID
restarts the function for each customerπ Use PARTITION BY
when you want results relative to a subgroup (e.g., a customerβs order rank).
ORDER BY
determines the sequence the function uses inside each partition.
ORDER BY OrderDate
means "oldest to newest"Changing the ORDER changes the meaning. It defines what βfirstβ, βlastβ, or βnextβ really means.
A frame clause controls which rows are visible to the window function for each row.
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
β running totalROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
β forward-looking windowROWS BETWEEN 2 PRECEDING AND CURRENT ROW
β moving average (3-row)OrderDate | Amount |
---|---|
2023-01-01 | 100 |
2023-01-05 | 200 |
2023-01-10 | 300 |
2023-01-15 | 400 |
SELECT OrderDate, Amount,
SUM(Amount) OVER (ORDER BY OrderDate
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotal,
SUM(Amount) OVER (ORDER BY OrderDate
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS FutureTotal,
AVG(Amount) OVER (ORDER BY OrderDate
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS MovingAvg3
FROM Orders;
OrderDate | Amount | RunningTotal | FutureTotal | MovingAvg3 |
---|---|---|---|---|
2023-01-01 | 100 | 100 | 1000 | 100.0 |
2023-01-05 | 200 | 300 | 900 | 150.0 |
2023-01-10 | 300 | 600 | 700 | 200.0 |
2023-01-15 | 400 | 1000 | 400 | 300.0 |
ROWS
= includes a specific number of rows relative to the current rowRANGE
= includes rows based on values being equal to the current rowβs ORDER BY valueIf OrderDate
has duplicates:
OrderDate | Amount |
---|---|
2023-01-01 | 100 |
2023-01-01 | 200 |
2023-01-02 | 300 |
Using ROWS BETWEEN CURRENT ROW AND CURRENT ROW
only includes one row. Using RANGE BETWEEN CURRENT ROW AND CURRENT ROW
includes both rows with 2023-01-01
.
π Use RANGE
when you want to include all rows with matching ORDER BY values.
π ROWS
is more precise for counting.
π RANGE
is better for grouped value comparisons.
LAST_VALUE()
usually doesA Common Table Expression (CTE) is often necessary when you want to filter or manipulate the results of a window function.
You can't directly reference a window function result in the WHERE
clause or reuse it in the same SELECT β SQL doesn't allow that. But by putting it in a CTE, you can treat the computed column as a normal field.
Find customers with the top 10% order amounts:
WITH Ranked AS (
SELECT *,
PERCENT_RANK() OVER (PARTITION BY Region ORDER BY Amount DESC) AS pr
FROM Orders
)
SELECT * FROM Ranked WHERE pr <= 0.10;
π The window function runs inside the CTE, and the outer query can safely filter on it.
ORDER BY
β RANK()
will return 1 for all rows. Without ordering, many functions treat all rows as ties.WHERE
after window logic β may exclude needed rows. You must use CTEs or subqueries to filter after computing window results.LAST_VALUE()
surprises. If you donβt specify a frame, you might get unexpected results like the current row instead of the actual last row.SUM()
without OVER()
eliminate row detail; window versions keep all rows.ROW_NUMBER()
to test sorting logicORDER BY
, even when optionalSELECT *
alongside your window to verify inputs