"Xorthax wanted to track each customer's total purchases over timeโwithout losing the detail of each individual transaction. That's where aggregate window functions shine."
These are traditional aggregate functions (like SUM()
or COUNT()
) used with the OVER()
clause so that they return a value for each row, instead of collapsing rows into groups.
You keep all your rows and get the aggregated insight across a window.
When you use an aggregate function like SUM()
or AVG()
in a window, you apply it across a moving or defined slice of dataโnot the whole table, and not a reduced group.
The full form looks like this:
SUM(column) OVER (
PARTITION BY grouping_column -- optional: split data into subgroups
ORDER BY ordering_column -- defines the row sequence in the window
ROWS BETWEEN ... -- optional: controls which rows are included
)
You can mix and match these pieces depending on what you're calculating:
PARTITION BY
for a single running total across all rowsROWS BETWEEN
for moving averages, rolling sums, etc.OrderID | CustomerID | OrderDate | Amount |
---|---|---|---|
1 | A1 | 2024-01-01 | 100 |
2 | A1 | 2024-01-05 | 150 |
3 | A1 | 2024-01-10 | 200 |
SELECT OrderID, Amount,
SUM(Amount) OVER (PARTITION BY CustomerID ORDER BY OrderDate) AS RunningTotal
FROM table
OrderID | Amount | RunningTotal |
---|---|---|
1 | 100 | 100 |
2 | 150 | 250 |
3 | 200 | 450 |
SELECT OrderID, Amount,
AVG(Amount) OVER (
PARTITION BY CustomerID
ORDER BY OrderDate
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) AS MovingAvg
FROM table
OrderID | Amount | MovingAvg | |
---|---|---|---|
1 | 100 | 100.0 | |
2 | 150 | 125.0 | |
3 | 200 | 150.0 |
Some phrases in a problem can be strong clues that an aggregate window function is appropriate:
"Running total"
"Cumulative"
"Rolling average"
"So far..." โ This implies a cumulative calculation. For example: "How much has the customer spent so far this month?" Thatโs a job for SUM()
with a window.
"As of this date..." โ Suggests you want a running or historical total up to a specific point in timeโperfect for SUM()
or MAX()
with ORDER BY
and possibly a frame clause.
"As of this date..."
ORDER BY
in the window clause when a running or rolling calculation is intended.GROUP BY
instead of OVER()
when row-level detail is needed.ROWS BETWEEN
for true running totals).A frame clause tells SQL which specific rows to include in the window when applying the aggregate function. It allows you to define a sliding or cumulative window relative to the current row.
By default, if you specify an ORDER BY
in a window function, SQL might apply a default frame like RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
โbut for full control, you can specify it explicitly using ROWS BETWEEN
or RANGE BETWEEN
.
AVG(Amount) OVER (
PARTITION BY CustomerID
ORDER BY OrderDate
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
)
๐ This defines a 3-row window: it includes the current row and the two preceding rows. That means:
This window "slides" forward row by row, so row 3 forward is always the last 3 rows.
SUM(Amount) OVER (
PARTITION BY CustomerID
ORDER BY OrderDate
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
)
๐ This frame includes all rows from the beginning of the partition up to and including the current row. It ensures each row receives a cumulative total that builds as the window slides forward.
This is useful when questions ask things like "total spent so far" or "balance as of this transaction."
Use CTEs to:
WITH RollingSums AS (
SELECT *, SUM(Amount) OVER (...) AS RunningTotal
FROM Orders
)
SELECT * FROM RollingSums WHERE RunningTotal > 500;
Windowed aggregates often work best when used alongside other window functions like ROW_NUMBER()
, RANK()
, or even another aggregate. Here are a couple of helpful combinations:
You can add both RANK()
and SUM()
in the same query to show how a user compares to others and what their running progress looks like:
SELECT
CustomerID,
OrderDate,
SUM(Amount) OVER (PARTITION BY CustomerID ORDER BY OrderDate) AS RunningTotal,
RANK() OVER (PARTITION BY CustomerID ORDER BY OrderDate) AS OrderRank
FROM Orders;
Want to know how far a row is from the highest value in its group? Combine with MAX()
:
SELECT
CustomerID,
OrderDate,
Amount,
MAX(Amount) OVER (PARTITION BY CustomerID) AS MaxAmount,
Amount - MAX(Amount) OVER (PARTITION BY CustomerID) AS BehindTop
FROM Orders;
These combinations give richer context for trends, outliers, or leaderboard-style logic. For example, the second query above shows how far each row's value is behind the top performer within its partitionโperfect for highlighting gaps or performance deltas.
ROW_NUMBER()
to see the order.Use Case | GROUP BY โ | Aggregate Window Function โ |
---|---|---|
Total per customer | โ | โ |
Keep each row | โ | โ |
Running total | โ | โ |
(You might also like: RANK, ROW_NUMBER, NTILE, Frame Clauses)