Window functions, finally explained without the jargon
GROUP BY collapses rows. Window functions keep them. Once that one distinction clicks, ranking, running totals, and 'top N per group' get easy.
A window function runs an aggregate like GROUP BY does, but keeps every row instead of collapsing them — attaching the result alongside each row. People find window functions intimidating because of one misconception: that they're a fancier GROUP BY. They're not. The difference is simple, and once it lands the rest is vocabulary.
The one thing to remember: GROUP BY collapses rows into one per group; a window function keeps all rows and adds the computed value to each.
GROUP BY collapses. Windows don't.
GROUP BY takes many rows and returns one per group. A window function runs the same kind of aggregate but keeps every row and attaches the result:
-- GROUP BY: one row per department.
SELECT department, AVG(salary)
FROM employees
GROUP BY department;
-- Window: every employee row, plus their department average.
SELECT
name,
department,
salary,
AVG(salary) OVER (PARTITION BY department) AS dept_avg
FROM employees;The OVER (...) clause is what makes it a window. PARTITION BY is the window's version of grouping — it defines which rows each calculation sees — but the original rows survive.
The three you'll use constantly
Most real work uses just three window functions:
ROW_NUMBER()— a unique sequential number per partition.RANK()— like row number, but ties share a rank and leave gaps.SUM(...) OVER (ORDER BY ...)— a running total.
The difference between ROW_NUMBER() and RANK() is the classic gotcha: with two people tied for first, ROW_NUMBER() still assigns 1 and 2, while RANK() assigns 1 and 1, then skips to 3.
"Top N per group" — the classic
This is the question window functions were born for: the highest-paid employee in each department.
SELECT name, department, salary
FROM (
SELECT
name,
department,
salary,
ROW_NUMBER() OVER (
PARTITION BY department
ORDER BY salary DESC
) AS rn
FROM employees
) ranked
WHERE rn = 1;Partition by the group, order by what you're ranking, then filter to rn = 1. Swap = 1 for <= 3 and you've got top 3 per department — the same pattern, one character changed.
Why this is worth drilling
Once the collapse-vs-keep distinction lands, ranking, running totals, gaps-and-islands, and period-over-period comparisons all open up — they're variations on the same OVER (...) idea. It's one of the highest-leverage SQL concepts for data and analytics interviews because so many real questions reduce to "rank within a group" or "running total over an order."
The pandas equivalent of "keep every row but compute per group" is transform — see pandas groupby that clicks. And the ranked subquery above is exactly the kind of step that reads better as a CTE; CTEs vs subqueries covers when to make that swap.
How this shows up on CodeOak
CodeOak's SQL query track is output-graded: you're scored on the table your query returns, compared deterministically against the expected result — no AI judging your code. Window functions are a recurring focus because they map so cleanly to that format: get the partition and order right and the table is correct. Take the assessment and the roadmap will surface them when they're your gap.
FAQ
What is the difference between a window function and GROUP BY?
GROUP BY collapses many rows into one per group. A window function performs a similar calculation but keeps every original row, attaching the computed value to each via an OVER (...) clause. Use a window when you need per-row detail and the group-level value together.
What does PARTITION BY do in a window function?
PARTITION BY defines the set of rows each window calculation considers — it's the window's equivalent of grouping. Unlike GROUP BY, it doesn't reduce the row count; it just scopes the calculation to rows that share the partition value.
How do I get the top N rows per group in SQL?
Add ROW_NUMBER() OVER (PARTITION BY group ORDER BY metric DESC) in a subquery or CTE, then filter the outer query to rn <= N. Use = 1 for the single top row per group, or <= 3 for the top three.