This SQL tutorial explains how SQL OVER PARTITION BY works โ showing how OVER and PARTITION BY combine to rank rows, calculate running totals, and analyze data without grouping it away. Learn window functions through simple examples and clear explanations.
By the end, youโll understand how to use SQL OVER PARTITION BY to write powerful analytic queries for interviews and real-world data work.
Window functions compute values across a set of rows that are somehow related to the current row, without collapsing the result set. They help you compare, rank, or accumulate across rows while preserving individual detail.
Unlike aggregate functions (which group and collapse data), window functions add extra insight per row.
FUNCTION(...) OVER (
PARTITION BY ...
ORDER BY ...
[ROWS or RANGE BETWEEN ...]
)
Each clause plays a specific role:
OVER() โ tells SQL that this is a window function. It defines the โwindowโ of rows the function can see and is required for all SQL window functions.PARTITION BY โ divides the dataset into groupsORDER BY โ defines how rows are sequenced within those groupsROWS BETWEEN โ defines which rows to include in each calculationThis section explains SQL OVER PARTITION BY in detail โ one of the most important building blocks of any window function.
Think of PARTITION BY as creating mini subgroups within your data. Each group resets the window function logic.
| CustomerID | OrderDate | Amount |
|---|---|---|
| C1 | 2023-01-01 | 100 |
| C1 | 2023-01-10 | 200 |
| C2 | 2023-01-01 | 300 |
| C2 | 2023-01-15 | 400 |
SELECT CustomerID, OrderDate, Amount,
ROW_NUMBER() OVER (ORDER BY OrderDate) AS RowNum_All,
ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY OrderDate) AS RowNum_ByCustomer
FROM Orders;
| CustomerID | OrderDate | Amount | RowNum_All | RowNum_ByCustomer |
|---|---|---|---|---|
| C1 | 2023-01-01 | 100 | 1 | 1 |
| C2 | 2023-01-01 | 300 | 2 | 1 |
| C1 | 2023-01-10 | 200 | 3 | 2 |
| C2 | 2023-01-15 | 400 | 4 | 2 |
PARTITION BY CustomerID restarts the function for each customer๐ Use PARTITION BY when you want results relative to a subgroup (e.g., a customerโs order rank).
ORDER BY determines the sequence the function uses inside each partition.
ORDER BY OrderDate means "oldest to newest"Changing the ORDER changes the meaning. It defines what โfirstโ, โlastโ, or โnextโ really means.
A frame clause controls which rows are visible to the window function for each row.
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW โ running totalROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING โ forward-looking windowROWS BETWEEN 2 PRECEDING AND CURRENT ROW โ moving average (3-row)Note: In most SQL databases, writing
SUM(amount) OVER (ORDER BY order_date)
already produces a running total โ because the default frame is
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.
Including the frame explicitly just makes your intent clearer and keeps your query portable across systems.
| OrderDate | Amount |
|---|---|
| 2023-01-01 | 100 |
| 2023-01-05 | 200 |
| 2023-01-10 | 300 |
| 2023-01-15 | 400 |
SELECT OrderDate, Amount,
SUM(Amount) OVER (ORDER BY OrderDate
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotal,
SUM(Amount) OVER (ORDER BY OrderDate
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS FutureTotal,
AVG(Amount) OVER (ORDER BY OrderDate
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS MovingAvg3
FROM Orders;
| OrderDate | Amount | RunningTotal | FutureTotal | MovingAvg3 |
|---|---|---|---|---|
| 2023-01-01 | 100 | 100 | 1000 | 100.0 |
| 2023-01-05 | 200 | 300 | 900 | 150.0 |
| 2023-01-10 | 300 | 600 | 700 | 200.0 |
| 2023-01-15 | 400 | 1000 | 400 | 300.0 |
ROWS = includes a specific number of rows relative to the current rowRANGE = includes rows based on values being equal to the current rowโs ORDER BY valueIf OrderDate has duplicates:
| OrderDate | Amount |
|---|---|
| 2023-01-01 | 100 |
| 2023-01-01 | 200 |
| 2023-01-02 | 300 |
Using ROWS BETWEEN CURRENT ROW AND CURRENT ROW only includes one row. Using RANGE BETWEEN CURRENT ROW AND CURRENT ROW includes both rows with 2023-01-01.
๐ Use RANGE when you want to include all rows with matching ORDER BY values.
๐ ROWS is more precise for counting.
๐ RANGE is better for grouped value comparisons.
LAST_VALUE() usually doesA Common Table Expression (CTE) is often necessary when you want to filter or manipulate the results of a window function.
You can't directly reference a window function result in the WHERE clause or reuse it in the same SELECT โ SQL doesn't allow that. But by putting it in a CTE, you can treat the computed column as a normal field.
Find customers with the top 10% order amounts:
WITH Ranked AS (
SELECT *,
PERCENT_RANK() OVER (PARTITION BY Region ORDER BY Amount DESC) AS pr
FROM Orders
)
SELECT * FROM Ranked WHERE pr <= 0.10;
๐ The window function runs inside the CTE, and the outer query can safely filter on it.
ORDER BY โ RANK() will return 1 for all rows. Without ordering, many functions treat all rows as ties.WHERE after window logic โ may exclude needed rows. You must use CTEs or subqueries to filter after computing window results.LAST_VALUE() surprises. If you donโt specify a frame, you might get unexpected results like the current row instead of the actual last row.SUM() without OVER() eliminate row detail; window versions keep all rows.ROW_NUMBER() to test sorting logicORDER BY, even when optionalSELECT * alongside your window to verify inputs