The objective of this SQL Server tutorial is to teach you how to use a CORRELATED SUBQUERY in a SQL statement to accept input from the main query for processing and generation of final resultset.
What is CORRELATED SUBQUERY in SQL Server?
There are two types of subqueries – CORRELATED SUBQUERY and UNCORRELATED SUBQUERY (simply put a normal subquery). A correlated subquery accepts input from the main query for processing and hence is dependent upon it. That is why it is called a CORRELATED SUBQUERY i.e. a subquery related to the main query. Without the main query a CORRELATED SUBQUERY does not have a life of its own. A CORRELATED SUBQUERY executes multiple times for every row fetched from the table by the main query. That is why a CORRELATED SUBQUERY is also called a REPEATING SUBQUERY.
The differences between CORRELATED and UNCORRELATED SUBQUERY are listed below.
Pt. No.
Uncorrelated Subquery
Correlated Subquery
1
Subquery executes first and only once
Subquery executes after main query and executes multiple times
2
Subquery provides input to the main query
Subquery accepts input from the main query
3
Subquery resultset value or values are patched to the final resultset
Subquery resultset value or values are used for evaluation or comparison with main query row value or values as per condition specified
4
Subquery resultset value or values form part of the final resultset
Subquery resultset value or values do not form part of the final resultset
5
Efficient and faster processing since subquery executes only once
Inefficient and slower processing since subquery processes repeatedly for every row fetched by main query from the table
Operation
A CORRELATED SUBQUERY can be part of SELECT, UPDATE and DELETE statements. A CORRELATED SUBQUERY can return one or more values and is usually used in conjunction with an aggregate function. The value or values returned is compared or evaluated against the specified outer query row value to keep or reject the outer query row (i.e. record) as a part of the final resultset.
SQL Server CORRELATED SUBQUERY Syntax
The basic syntax of a CORRELATED SUBQUERY can be presented as below.
SELECT column_list
FROM table1 outer_table
WHERE outer_column operator
(select column_list
FROM table2 inner_table
WHERE inner_table.inner_column= outer_table.outer_column);
In this syntax,
column_list – list of table columns.
outer – table on which the main query operates.
outer_column – column belonging to the outer table.
operator – a comparison, logical or mathematical operator.
inner – table on which the subquery operates. It can be the same as outer table or another table.
inner _column – column belonging to the outer table.
1) Example 1: SQL Server CORRELATED SUBQUERY
Let us understand the same with the help of some examples. Suppose we have a table called orders containing the information of orders placed by different customers as below. We will use this table as the sample reference table for our example.
order_id
cust_name
order_date
order_value
210
Michael Smith
6/11/2019
5500
211
Michael Smith
6/11/2019
5100
212
Jackie Collins
6/13/2019
3139
213
Jackie Collins
6/13/2019
3000
214
Howard Spencer
6/13/2019
3550
215
Howard Spencer
6/13/2019
3500
216
Michael Smith
6/22/2019
5000
217
Jackie Collins
6/22/2019
5500
218
Jackie Collins
6/22/2019
3000
Table: orders
Suppose we want to find out the highest value order placed by each customer with the resultset sorted by order value. We can do so through the following query with a CORRELATED SUBQUERY in it.
SELECT cust_name, order_id, order_value
FROM orders o1
WHERE order_value IN
(SELECT MAX (order_value) FROM orders o2
WHERE o2.cust_name = o1.cust_name GROUP BY cust_name)
ORDER BY order_value DESC;
It will generate the following result which fulfils our requirement.
cust_name
order_id
order_value
Michael Smith
210
5500
Jackie Collins
217
5500
Howard Spencer
214
3550
Explanation: Here the outer main query executes first and fetches the mentioned column values from every row of the table one by one and passes it to the subquery. The subquery then runs in its entirety determining the highest value order for each customer with the help of the MAX aggregate function. The outer table row order value is then compared with this value. If outer table row order value matches any of the MAX values then that row is retained otherwise rejected. In this manner the query isolates the highest value customer order records for each customer which constitutes the final resultset.
2) Example 2: SQL Server CORRELATED SUBQUERY
Let us consider another example of an employees table containing the data of company employees. The table is represented below. The table has employees belonging to 3 departments identified by the id’s 221, 222, 223.
emp_id
first_name
last_name
emp_salary
dept_id
1270
Claudia
Crawford
52000
221
1271
Jeff
Goldsmith
50000
221
1272
Nigel
Knight
60000
222
1273
James
Orwell
66000
222
1274
Cindy
Smith
65000
222
1275
Niki
Bailey
55000
223
1276
Mike
Mattis
69000
223
1277
Angelia
Jackson
59000
223
1278
Martha
DeClarke
55000
223
1279
Albert
Hemingway
55000
221
Suppose we want to retrieve the list of employees whose salary is higher than the department average sorted by employee salary. We can do so through the following query using a CORRELATED SUBQUERY in it.
SELECT emp_id, first_name, last_name, emp_salary
FROM employees e1
WHERE emp_salary >
(SELECT AVG (emp_salary)
FROM employees
WHERE dept_id = e1.dept_id)
ORDER BY emp_salary DESC;
It will generate the following result which fulfils our requirement.
emp_id
first_name
last_name
emp_salary
1276
Mike
Mattis
69000
1273
James
Orwell
66000
1274
Cindy
Smith
65000
1279
Albert
Hemingway
55000
Explanation: Just like in the previous example the outer main query executes first and fetches the mentioned column values from every row of the table one by one and passes it to the subquery. The subquery then runs in its entirety determining the average salary for each department with the help of the AVG aggregate function. The outer table row employee salary value is then compared with this value. If outer table employee salary is higher than any of the department values then that row is retained otherwise rejected. In this manner the query isolates the employee records of those employees whose salary above the department average and that constitutes the final resultset.