Subqueries in SQL
A subquery, also known as a nested query or inner query, is a query within another SQL query. It is used to retrieve data that will be used in the main query as a condition to further restrict the data to be retrieved. Subqueries can be used in SELECT
, INSERT
, UPDATE
, and DELETE
statements, as well as in clauses like WHERE
, FROM
, and HAVING
.
Syntax
Types of Subqueries
- Single-row subquery: Returns only one row.
- Multi-row subquery: Returns more than one row.
- Correlated subquery: A subquery that depends on the outer query for its values.
Single-Row Subqueries
A single-row subquery returns only one row of data. It is often used with comparison operators like =
, >
, <
, >=
, <=
, and <>
.
Example: Find the employee with the highest salary
The subquery (SELECT MAX(salary) FROM employees)
returns the highest salary from the employees
table. The main query then uses this value to find the employee(s) with that salary.
Multi-Row Subqueries
A multi-row subquery returns more than one row of data. It is often used with operators like IN
, ANY
, ALL
, and EXISTS
.
Example: Find employees who work in the same district as Anand
The subquery (SELECT district FROM employees WHERE name = 'Anand')
returns the district where Anand works. The main query then finds all employees who work in that district.
Correlated Subqueries
A correlated subquery is a subquery that depends on the outer query for its values. It is executed repeatedly, once for each row that is evaluated by the outer query.
Example: Find employees whose salary is greater than the average salary in their district
The subquery (SELECT AVG(salary) FROM employees e2 WHERE e2.district = e1.district)
calculates the average salary for each district. The main query then compares each employee’s salary to the average salary in their district.
Practical Use Cases
- Filtering Data: Subqueries can be used to filter data based on conditions that are derived from other tables or the same table.
- Data Aggregation: Subqueries can be used to perform calculations or aggregations that are then used in the main query.
- Data Comparison: Subqueries can be used to compare data between different tables or within the same table.
Example Database
Let’s assume we have an employees
table with the following structure:
Sample Data:
id | name | salary | district | state |
---|---|---|---|---|
1 | Anand | 50000 | Karaikal | Puducherry |
2 | Bala | 60000 | Coimbatore | Tamil Nadu |
3 | Kavitha | 55000 | Chennai | Tamil Nadu |
4 | Raj | 70000 | Cuddalore | Tamil Nadu |
5 | Kumar | 45000 | Madurai | Tamil Nadu |
6 | Ram | 80000 | Salem | Tamil Nadu |
7 | Karthik | 90000 | Trichy | Tamil Nadu |
8 | David | 40000 | Tirupati | Andhra Pradesh |
9 | Kannan | 75000 | Bangalore | Karnataka |
10 | Siva | 65000 | Kannur | Kerala |
11 | Ramesh | 85000 | Wayanad | Kerala |
12 | Suresh | 95000 | Palakkad | Kerala |
13 | Sathish | 50000 | Idukki | Kerala |
14 | Sujatha | 60000 | Karaikal | Puducherry |
Example Queries Using Subqueries
1. Single-Row Subquery
Find the employee with the highest salary
Output:
name | salary |
---|---|
Suresh | 95000 |
The subquery (SELECT MAX(salary) FROM employees)
finds the highest salary in the employees
table, which is 95000
. The main query then retrieves the employee(s) with that salary.
2. Multi-Row Subquery
Find employees who work in the same district as Anand
Output:
name | district |
---|---|
Anand | Karaikal |
Sujatha | Karaikal |
The subquery (SELECT district FROM employees WHERE name = 'Anand')
returns Karaikal
, the district where Anand works. The main query then retrieves all employees who work in Karaikal
.
3. Multi-Row Subquery with IN
Find employees who work in districts located in Tamil Nadu
Output:
name | district |
---|---|
Bala | Coimbatore |
Kavitha | Chennai |
Raj | Cuddalore |
Kumar | Madurai |
Ram | Salem |
Karthik | Trichy |
The subquery (SELECT district FROM employees WHERE state = 'Tamil Nadu')
returns all districts in Tamil Nadu. The main query then retrieves employees who work in those districts.
4. Correlated Subquery
Find employees whose salary is greater than the average salary in their district
Output:
name | salary | district |
---|---|---|
Bala | 60000 | Coimbatore |
Kavitha | 55000 | Chennai |
Raj | 70000 | Cuddalore |
Ram | 80000 | Salem |
Karthik | 90000 | Trichy |
Kannan | 75000 | Bangalore |
Ramesh | 85000 | Wayanad |
Suresh | 95000 | Palakkad |
The subquery (SELECT AVG(salary) FROM employees e2 WHERE e2.district = e1.district)
calculates the average salary for each district. The main query compares each employee’s salary to the average salary in their district and retrieves those who earn more.
5. Subquery in SELECT
Display each employee’s salary along with the average salary of their state
Output:
name | salary | state | avg_state_salary |
---|---|---|---|
Anand | 50000 | Puducherry | 55000.00 |
Bala | 60000 | Tamil Nadu | 65000.00 |
Kavitha | 55000 | Tamil Nadu | 65000.00 |
Raj | 70000 | Tamil Nadu | 65000.00 |
Kumar | 45000 | Tamil Nadu | 65000.00 |
Ram | 80000 | Tamil Nadu | 65000.00 |
Karthik | 90000 | Tamil Nadu | 65000.00 |
David | 40000 | Andhra Pradesh | 40000.00 |
Kannan | 75000 | Karnataka | 75000.00 |
Siva | 65000 | Kerala | 76250.00 |
Ramesh | 85000 | Kerala | 76250.00 |
Suresh | 95000 | Kerala | 76250.00 |
Sathish | 50000 | Kerala | 76250.00 |
Sujatha | 60000 | Puducherry | 55000.00 |
The subquery (SELECT AVG(salary) FROM employees e2 WHERE e2.state = e1.state)
calculates the average salary for each state. This value is displayed alongside each employee’s details.
6. Subquery in HAVING
Find districts where the average salary is greater than the overall average salary
Output:
district | avg_district_salary |
---|---|
Bangalore | 75000.00 |
Palakkad | 95000.00 |
Salem | 80000.00 |
Trichy | 90000.00 |
Wayanad | 85000.00 |
The subquery (SELECT AVG(salary) FROM employees)
calculates the overall average salary. The main query groups employees by district and retrieves districts where the average salary is greater than the overall average.
7. Subquery in FROM
Subquery in FROM
clause is called as Derived table in SQL Server.
Find the top 3 highest-paid employees
Output:
name | salary |
---|---|
Suresh | 95000 |
Karthik | 90000 |
Ramesh | 85000 |
The subquery (SELECT name, salary FROM employees ORDER BY salary DESC)
retrieves all employees sorted by salary in descending order. The main query then limits the result to the top 3 rows.
Key Takeaways
- Subqueries are queries nested inside another query and can be used in
SELECT
,INSERT
,UPDATE
, andDELETE
statements. - Single-row subqueries are useful for comparisons with a single value (e.g., finding the highest salary).
- Multi-row subqueries are ideal for filtering data based on multiple values (e.g., finding employees in specific districts).
- Correlated subqueries allow you to compare data within the same table (e.g., comparing salaries within districts).
- Always test subqueries independently to ensure they return the expected results before integrating them into the main query.