SQL HAVING Clause

The SQL Server HAVING clause is very similar to the WHERE clause because both are helpful to filter table records based on a given expression. As we already know, the WHERE clause does not allow us to check any conditions against the aggregated data. So, to filter grouped data or check conditions against the aggregated data, we must use the HAVING Clause.

The definition goes like the SQL HAVING clause is useful for restricting the number of rows (or records) returned by the GROUP BY statement. It means that to use the HAVING clause, we have to use GROUP BY because HAVING filters data that we get from that set. If we haven’t used grouping, then this HAVING clause behaves like a WHERE, and in real-time, it throws an error.

Moreover, the WHERE Condition is applied to each row before the Server Group By is implemented. However, the SQL Server HAVING clause will apply after the grouping has happened. In general, the HAVING clause can be commonly used with the aggregated functions like AVG(), COUNT(), MIN(), MAX(), SUM(), etc.

For example, if you want to retrieve the employees’ salaries by their department, you can group the employees by their department. Next, use the aggregated SUM() function to find the salaries by department group. However, there are some situations where you need the salaries of a few departments or to find which department is getting the highest salaries. In such cases, you can use the SQL HAVING clause to filter the aggregated result: HAVING SUM(Salaries) > 1000000. This article shows how to write the HAVING Clause to filter the data after the group applies the aggregate function.

SQL HAVING Clause Syntax

The HAVING Clause works along with the GROUP BY Statement, and its syntax is written as shown below:

SELECT [Column1], [Column2]...[ColumnN],
Aggregate Function (Expression | Column_Name)
FROM [Source]
WHERE [Conditions] -- Optional
GROUP BY [Column1], [Column2]...[ColumnN]
HAVING [Conditions] -- This Condition is on Aggregate Function (Expression | Column_Name)
ORDER BY ASC|DESC

In the above syntax, the aggregate functions that the SQL Server HAVING clause supports are SUM(), COUNT(), MIN(), MAX(), AVG(), etc.

  •  [Column1],…[ColumnN]: List of column names that you want to retrieve from the database tables. Remember, these column names must be included within the GROUP BY.
  • Aggregate Function: Here, you have to use any of the available aggregate functions to find the count, minimum, maximum, total, etc.
  • WHERE condition: It is optional, and if you want the rows to filter before grouping, use it.
  • GROUP BY: Please include all the non-aggregated columns here. Otherwise, it will throw an error.
  • HAVING condition: Here, you can use the aggregate function to filter the grouped rows.
  • ORDER BY: It is optional and helps to sort the result in ascending or descending order.

We have already explained the Aggregate Functions, Conditions, and Group By options. So, please refer to Group By before looking into this. Here, the SQL Server HAVING clause helps to provide the Filters or Conditions on the Aggregated Data we got from the Group By.

Let me give you some simple and complex examples of the HAVING clause so that you can understand the working functionality.

SQL HAVING Clause Single Condition Example

Let me show you how to write the HAVING clause with a single condition. And for this, we use the Adventure Works database tables. In this example, we use the MAX and the SUM functions on multiple columns, and within the HAVING clause, we use the SUM() function to filter the select statement data.

SELECT Color, SUM(SalesAmount) AS [Total Sales], 
MAX(SalesAmount) AS [Maximum Sale]
FROM DimProduct pr
JOIN FactInternetSales fact ON fact.ProductKey = pr.ProductKey
GROUP BY Color
HAVING SUM(SalesAmount) > 150000

In the above query, the GROUP BY statement will group the products by color. Next, the SQL HAVING Clause will check the condition of whether the group’s total or sum of the sales amount is greater than 150000. The SELECT statement selects the product Color, Maximum Sale, and the sum of the Sales amount.

Use the HAVING Clause with Multiple Conditions

This example uses the SQL Server HAVING clause with Multiple Conditions combined by using the AND and NOT operators. First, the query will organize the Dim Products by Color. Next, the HAVING clause will check the condition of whether the SUM([SalesAmount]) is greater than 10000 and the sum of the order quantity is not less than 3500.

SELECT Color,
SUM(SalesAmount) AS [Total Sales],
SUM([OrderQuantity]) as [Total Orders]
FROM DimProduct pr
JOIN FactInternetSales fact ON fact.ProductKey = pr.ProductKey
GROUP BY Color
HAVING SUM(SalesAmount) > 10000 AND NOT SUM(OrderQuantity) < 3500

SQL Server HAVING Clause with SUM() Function

As we mentioned earlier, you can use any of the available aggregate functions in the HAVING clause to filter the grouped records. In this series of examples, we first start with the SUM() function HAVING clause with a simple single.

In the below query, first, the Group By will categorize the Dim customers by their English Occupation. Next, the SUM() function will find the sum of all the records in each Occupation group. The third step is the execution of the HAVING clause. Here, it will check the condition whether the sum of the [YearlyIncome] is greater than 200000000.

SELECT [EnglishOccupation]
,SUM([YearlyIncome]) AS [Total Income]
FROM [DimCustomer]
GROUP BY [EnglishOccupation]
HAVING SUM([YearlyIncome]) > 200000000
SQL HAVING clause syntax with GROUP BY example

It is another example where it will find the sum of sales in each customer occupation group. And then, it print the customers whose total sales are greater than 3000000.

SELECT [EnglishOccupation]
,SUM(SalesAmount) AS [Total Sales]
FROM [DimCustomer] JOIN FactInternetSales
ON DimCustomer.CustomerKey = FactInternetSales.CustomerKey
GROUP BY [EnglishOccupation]
HAVING SUM(SalesAmount) > 3000000

Filtering Groups by the COUNT function

This example uses the SQL COUNT function in the HAVING clause to find or count the total number of products in each product color. Next, it displays the color groups whose product count is at least 10.

SELECT Color, COUNT([ProductKey]) AS TotalProducts
FROM [DimProduct]
GROUP BY Color
HAVING COUNT([ProductKey]) > 10

SQL HAVING Clause with MAX() function

The combination of the MAX function and the HAVING clause will find the maximum value in each group. On top of it, you can further filter the groups based on this maximum value.

For example, the below query will group the DimCustomer based on their occupation. And then, it return the maximum yearly income in each occupation. The HAVING clause will filter those EnglishOccupation groups and return the customers whose maximum income is greater than 10000.

SELECT [EnglishOccupation]
,MAX([YearlyIncome]) AS [Highest Income]
FROM [DimCustomer]
GROUP BY [EnglishOccupation]
HAVING MAX([YearlyIncome]) > 10000

Let me show you another example of the SQL HAVING clause with the MAX() and SUM() functions. The SELECT statement selects the Occupation, Education, Sum of yearly Income, and Maximum Sales. Next, it groups the Sim customers by their English Occupation and Education. The HAVING clause will check the condition of whether the MAX([YearlyIncome]) > 80000.

SELECT [EnglishOccupation], EnglishEducation
,SUM([YearlyIncome]) AS [Total Income]
,MAX([YearlyIncome]) AS [Highest Income]
FROM [DimCustomer]
GROUP BY [EnglishOccupation], EnglishEducation
HAVING MAX([YearlyIncome]) > 80000
ORDER BY EnglishOccupation

Using HAVING with the MIN function example

You can also use the aggregate MIN function along with the SQL HAVING clause to apply a minimum or lowest filter on the selected data. The below query will group the Customers by their occupation and display the lowest income in the respective occupation. Next, the HAVING clause will filter those occupation groups by displaying the EnglishOccupation groups whose YearlyIncome is greater than 10000.

SELECT [EnglishOccupation]
,MIN([YearlyIncome]) AS [Lowest Income]
FROM [DimCustomer]
GROUP BY [EnglishOccupation]
HAVING MIN([YearlyIncome]) > 10000

SQL HAVING Clause with the AVG() function

This example uses the AVG aggregate function to find the average income in each customer occupation group. Next, the HAVING clause with the AVG() combination will display the groups whose average occupation is greater than 40000.

SELECT [EnglishOccupation]
,AVG([YearlyIncome]) AS [Average Income]
FROM [DimCustomer]
GROUP BY [EnglishOccupation]
HAVING AVG([YearlyIncome]) > 40000

If you are comfortable with the joins, you can try the code below. This query will find the average sales in each customer occupation. Next, it print the groups whose average sales amount is greater than 450.

SELECT [EnglishOccupation]
,AVG(SalesAmount) AS [Average Sales]
FROM [DimCustomer] JOIN FactInternetSales
ON DimCustomer.CustomerKey = FactInternetSales.CustomerKey
GROUP BY [EnglishOccupation]
HAVING AVG(SalesAmount) > 450
SQL HAVING clause query to filter aggregated data

SQL HAVING with ORDER BY Clause

As you already know, the ORDER BY clause helps to sort the selected data in ascending or descending order. You can use the Order By along with the HAVING clause to sort the filtered grouped data in ASC or DESC.

In this example, we used MAX(), MIN(), and SUM() aggregate functions to find the maximum, minimum, and total income of the customers’ occupation group. Next, the HAVING clause will filter the customer group whose maximum income is greater than the 30000 threshold. Next, the ORDER BY clause will sort the whole result set table by total yearly income in descending order.

SELECT [EnglishOccupation]
,SUM([YearlyIncome]) AS [Total Income]
,MAX([YearlyIncome]) AS [Maximum Income]
,MIN([YearlyIncome]) AS [Minimum Income]
FROM [DimCustomer]
GROUP BY [EnglishOccupation]
HAVING MAX([YearlyIncome]) > 30000
--ORDER BY EnglishOccupation
ORDER BY [Total Income] DESC

Logical AND, OR, NOT Operators to Filter the Data

As we have already mentioned in the above examples, the SQL HAVING clause can be useful with either a single condition or multiple conditions, depending on the requirement. While using multiple conditions, you can use any of the existing logical operators that include AND, OR, and NOT.

  • AND: It returns the records when both conditions return TRUE.
  • OR: The query returns the rows if either of the two conditions is met.
  • NOT: It returns the opposite result.

The following query uses the DimProduct and the FactInternetSales tables to show the logical operators inside the HAVING clause. First, query groups the products by their colors and counts the total number of products in each color, along with their respective sales amounts. Within the SQL HAVING clause, we used two conditions combined using the logical AND operator so that both conditions have to be met to display that particular product color group. To display the records, the sum of the sales amount has to be greater than 10000, and the products count should be less than 9000.

SELECT Color, COUNT(pr.[ProductKey]) AS TotalProducts,
SUM(SalesAmount) AS [Total Sales]
FROM DimProduct pr
JOIN FactInternetSales fact ON fact.ProductKey = pr.ProductKey
GROUP BY Color
HAVING SUM(SalesAmount) > 10000 AND COUNT(pr.[ProductKey]) < 9000
SQL HAVING clause with Multiple Conditions

SQL Server Having Vs. Where

Although both the WHERE and HAVING clauses are helpful to filter the selected data, their approach are quite different from others. For example, the WHERE condition will apply to the individual rows before they are grouped by the GROUP BY clause. Next, the SQL HAVING clause will apply the filters on those grouped rows. The following table lists some of the differences between HAVING and WHERE clauses.

HAVINGWHERE
It is executed after the GROUP BY. It is used after the groups are created, so it is the Post Filter.The WHERE clause is executed before the GROUP BY or groups are created, so it is the Pre Filter.
SQL HAVING clause uses the given condition to filter the aggregated or grouped data. So, instead of a single row, it checks the condition on grouped rows.WHERE clause uses the given condition to filter every single row from the SELECT statement. So, it checks the condition on individual table row instead of grouped rows.
HAVING need a GROUP BY clause to combine or group the same kind of records.AS there is no impact on the output, the WHERE clause can be executed with or without the GROUP BY.
The order of the query execution is first GROUP BY and then HAVING clause.The order of the query execution is first WHERE clause is used before the GROUP BY.
It is used to filter the data using aggregate functions.You cannot use the aggregate function inside the WHERE clause to filter the data.

We can also use the WHERE conditions along with this clause. The next section will cover the same.

Combining WHERE and HAVING clauses

In the previous section, we showed the difference between the HAVING and WHERE clauses, but we can use the combination of both to apply multiple filters. For instance, you need a WHERE clause to apply a particular condition on individual rows to exclude a few rows from grouping. Once the grouping happens, the HAVING clause will apply additional filters to restrict the grouped rows.

This example shows SQL Server Having vs. Where clause. The following query will first extract the customer’s data whose Education is not equal to [Partial High School].

Next, the GROUP BY statement will group the customers by Occupation and Education. Last, the HAVING Clause will check the condition of whether the MAX([YearlyIncome]) > 90000 or not. So, right before the grouping happens, the table data filters: customers whose education is not ‘Partial High School’.

SELECT [EnglishOccupation], EnglishEducation
,SUM([YearlyIncome]) AS [Total Income]
,MAX([YearlyIncome]) AS [Highest Income]
FROM [DimCustomer]
WHERE EnglishEducation <> 'Partial High School'
GROUP BY [EnglishOccupation], EnglishEducation
HAVING MAX([YearlyIncome]) > 90000
ORDER BY EnglishOccupation

To explain the difference, we are commenting on the last line of the query. Please check all the records whose Education is not Partial High School on the bottom left side image.

This time, we are commenting on the WHERE condition. That is why you see the Partial High School in the result window on the bottom right side of the image.

Difference between SQL HAVING and WHERE clause

Nested Subqueries in SQL HAVING clause

The following query is a simple example for demonstrating the nested subqueries within the HAVING clause.

  • The main SELECT statement will group the products by their name and calculate the sum of the sales amount for each product name.
  • The HAVING clause has a SUM(fact.SalesAmount)  condition to check that each product name group’s sales amount is greater than the subquery (average) value.
  • The subquery has a nested query where the inner one finds the sum of the sales amount for each product key group. Next, the subquery finds the average of them.
  • The final result will be the products whose total sales amount is greater than the average sales.
SELECT pr.EnglishProductName, SUM(fact.SalesAmount) AS [Total Sales]
FROM FactResellerSales fact
JOIN DimProduct pr ON fact.ProductKey = pr.ProductKey
GROUP BY pr.EnglishProductName
HAVING SUM(fact.SalesAmount) > (SELECT AVG(Sales) FROM
(SELECT SUM(ft.SalesAmount) AS Sales
FROM FactResellerSales ft GROUP BY ft.ProductKey) AS AvgSales);

It produces the result of 88 rows, and we simply show the first 6 records:

EnglishProductNameTotal Sales
HL Mountain Frame – Black, 38501788.1977
HL Mountain Frame – Black, 42901590.2336
HL Mountain Frame – Silver, 38930780.6807
HL Mountain Frame – Silver, 42269874.0096
HL Mountain Frame – Silver, 46499556.5724
HL Road Frame – Black, 44375026.9772

If you are confused by the multilevel nested subqueries within the HAVING clause, the following example is a very simple example. Here, the SQL HAVING clause uses one subquery, and it finds the average sales of the complete FactInternetSales table.

SELECT pr.EnglishProductName, SUM(fact.SalesAmount) AS [Total Sales]
FROM FactResellerSales fact
JOIN DimProduct pr ON fact.ProductKey = pr.ProductKey
GROUP BY pr.EnglishProductName
HAVING SUM(fact.SalesAmount) > (SELECT AVG(SalesAmount) FROM FactResellerSales)

It produces the result of 244 rows, and we are showing the first 5 rows of the result set:

EnglishProductNameTotal Sales
AWC Logo Cap31541.3461
Bike Wash – Dissolver11188.3725
Cable Lock16225.22
Chain9377.7102
Classic Vest, M77614.101

SQL HAVING Clause Common Errors & Best Practices

The following is a list of some of the common errors that you might face and the best practices to avoid them without any trouble.

  1. When you are using the HAVING clause, the columns that you have selected in the SELECT statement should be placed either in the GROUP BY clause or within the aggregate function. Otherwise, the query will throw an error.
  2. It is always advisable to use the ALIAS column names for the aggregated results. So that the results should be more meaningful and understandable.
  3. The SQL HAVING clause will only work on grouped records (not individual rows), so you must use it with the combination of the GROUP BY clause.
  4. It is important to understand the order of the HAVING clause query execution; otherwise, you may confuse it with the result set. The sequence of execution is: SELECT -> FROM -> WHERE -> GROUP BY -> HAVING -> -> ORDER BY.
  5. While working with large datasets, use the WHERE condition to filter the data before applying the aggregation and grouping to avoid the load.
  6. To perform the numeric comparison, you can use the list of available comparison operations such as =, !=, >, <, etc.
  7. Use the logical operators (AND, NOT, and OR) to combine multiple aggregate functions within the HAVING clause.
  8. Using aggregate functions to filter the data inside the WHERE condition will throw an error, so use the HAVING clause.
Categories SQL

Comments are closed.