3 Window Functions Explained
Window functions in SQL are powerful tools that allow you to perform calculations across a set of table rows that are somehow related to the current row. Unlike aggregate functions, window functions do not cause rows to become grouped into a single output row. Instead, the rows retain their separate identities, making them ideal for complex data analysis tasks.
Key Concepts
- OVER Clause
- PARTITION BY Clause
- ORDER BY Clause
- Frame Clause
- Common Window Functions
1. OVER Clause
The OVER
clause is used to define a window of rows around the current row. This window can be used with aggregate functions to perform calculations without collapsing the result set into a single row.
Example:
SELECT EmployeeID, Department, Salary, AVG(Salary) OVER () AS AvgSalary FROM Employees;
In this example, the AVG(Salary) OVER ()
calculates the average salary across all rows without grouping them.
2. PARTITION BY Clause
The PARTITION BY
clause is used within the OVER
clause to divide the result set into partitions. The window function is then applied to each partition separately.
Example:
SELECT EmployeeID, Department, Salary, AVG(Salary) OVER (PARTITION BY Department) AS DeptAvgSalary FROM Employees;
Here, the AVG(Salary) OVER (PARTITION BY Department)
calculates the average salary for each department separately.
3. ORDER BY Clause
The ORDER BY
clause within the OVER
clause is used to specify the order of rows within each partition. This is particularly useful for functions that depend on the order of rows, such as ROW_NUMBER()
or RANK()
.
Example:
SELECT EmployeeID, Department, Salary, ROW_NUMBER() OVER (PARTITION BY Department ORDER BY Salary DESC) AS SalaryRank FROM Employees;
In this example, the ROW_NUMBER() OVER (PARTITION BY Department ORDER BY Salary DESC)
assigns a unique rank to each employee within their department based on their salary.
4. Frame Clause
The frame clause is used to define a subset of rows within the partition. This allows for more granular control over the rows that are included in the window function calculation.
Example:
SELECT EmployeeID, Department, Salary, SUM(Salary) OVER (PARTITION BY Department ORDER BY Salary ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS RollingSum FROM Employees;
Here, the SUM(Salary) OVER (PARTITION BY Department ORDER BY Salary ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)
calculates the sum of the current row's salary plus the salaries of the preceding and following rows within the same department.
5. Common Window Functions
There are several common window functions that can be used with the OVER
clause:
ROW_NUMBER()
: Assigns a unique number to each row within a partition.RANK()
: Assigns a rank to each row within a partition, with gaps for ties.DENSE_RANK()
: Assigns a rank to each row within a partition, without gaps for ties.NTILE(n)
: Divides the rows in a partition inton
buckets.LAG(column, n)
: Accesses data from the previous row within a partition.LEAD(column, n)
: Accesses data from the following row within a partition.
Example:
SELECT EmployeeID, Department, Salary, ROW_NUMBER() OVER (PARTITION BY Department ORDER BY Salary DESC) AS SalaryRank, RANK() OVER (PARTITION BY Department ORDER BY Salary DESC) AS SalaryRankWithGaps, DENSE_RANK() OVER (PARTITION BY Department ORDER BY Salary DESC) AS SalaryRankWithoutGaps, NTILE(4) OVER (PARTITION BY Department ORDER BY Salary DESC) AS SalaryQuartile, LAG(Salary, 1) OVER (PARTITION BY Department ORDER BY Salary DESC) AS PreviousSalary, LEAD(Salary, 1) OVER (PARTITION BY Department ORDER BY Salary DESC) AS NextSalary FROM Employees;
This example demonstrates the use of multiple window functions to analyze salary data within each department.
Understanding and mastering window functions can significantly enhance your ability to perform complex data analysis tasks in SQL, making your queries more efficient and powerful.