The SQL LAG is one of the Analytic functions, which is exactly opposite to LEAD. This lag function allows you to access the data from a previous row without using any SELF JOIN. The basic syntax of the LAG is as shown below:
SELECT LAG([Scalar Expression], [Offset], [Default]) OVER ( PARTITION_BY_Clause ORDER_BY_Clause ) FROM [Source]
- Scalar Expression: It can be a Column, expression, or Subquery that returns a single value.
- Offset: Please define the number of rows you want to move backward. For example, If it is 3, then it will select the 3rd previous row (-3) as a result — Use Column, Subquery, or any expression that returns a single value.
- Default: You can define the default value. If you ignore, NULL is returned.
- Order_By_Clause: This is used to sort the Partitioned data. Please refer to Order By Clause in SQL Server.
- Partition_By_Clause: It separates the Selected records into partitions.
- If you specified the Partition By Clause, the SQL Server LAG function starts picking the previous rows in each partition.
- If you haven’t specified the Partition By, it will consider all rows as a single partition.
For this LAG demo, we are going to use the below-shown data.
SQL Server LAG without Partition By Clause
What happens when we ignore the Partition By Clause in the LAG function?
SELECT [FirstName] ,[LastName] ,[Education] ,[Occupation] ,[YearlyIncome] ,[Sales] ,LAG([Sales]) OVER (ORDER BY [YearlyIncome]) AS CompareSale FROM [Employee]
As you can see, the lag function returned NULL as output for the 1st record. It’s because there is no previous row for that record (it is the first record indeed)
This statement sorts the Employee records in Ascending order based on their Yearly Income
ORDER BY [YearlyIncome]
Next, this function will return the previous value of the row (record before) as the output. If there are no rows to return, the SQL Server Lag function will return a NULL value because we haven’t set any default value.
LAG([Sales]) OVER (ORDER BY [YearlyIncome]) AS CompareSale
Let me define the Offset value =1, and the default value = 0
SELECT [FirstName] ,[LastName] ,[Education] ,[Occupation] ,[YearlyIncome] ,[Sales] ,LAG([Sales], 1, 0) OVER (ORDER BY [YearlyIncome]) AS CompareSale FROM [Employee]
SQL Server LAG with Partition By Example
How to write the previous values from the partitioned records? The following Lag Query partition the data by Occupation using their yearly Income. And then, the lag function returns previous Sales values in each partition.
SELECT [FirstName] + ' ' + [LastName] AS [Full Name] ,[LastName] ,[Education] ,[Occupation] ,[YearlyIncome] ,[Sales] ,LAG([Sales], 1, 0) OVER ( PARTITION BY [Occupation] ORDER BY [YearlyIncome] ) AS SaleCompare FROM [Employee]
If you see, for the 1st, 5th, 9th, and 13th records, the Lag function assigned 0 because there are no previous records within that partition.
LAG with Offset Value
In the above statement, we used it with Partition by clause. It will select the previous rows with the individual partition.
Let us see what happen when we change the SQL LAG Function offset value from 1 to 2 and the default value to 100.
SELECT [FirstName] + ' ' + [LastName] AS [Full Name] ,[Education] ,[Occupation] ,[YearlyIncome] ,[Sales] ,LAG([Sales], 2, 100) OVER ( PARTITION BY [Occupation] ORDER BY [YearlyIncome] ) AS SaleCompare FROM [Employee]
This function is selecting the -2nd row (2 rows ahead), rather than choosing the previous row. If you modify the value to 3, then it will select the last third row (-3rd row) by jumping the last two rows, etc.