The UNION clause in SQL is used to combine the result sets of two or more SELECT statements into a single result set. It removes duplicate rows between the SELECT statements, ensuring that each row in the final result set is unique. This is particularly useful when you want to combine data from multiple tables or queries while eliminating duplicates.

Key Points About UNION

  1. Combining Results: UNION is used to combine rows from two or more tables or queries.
  2. Duplicate Removal: By default, UNION removes duplicate rows from the combined result set.
  3. Column Matching: The number of columns and their data types must match in all the SELECT statements being combined.
  4. Order of Columns: The columns in the SELECT statements must be in the same order.

Syntax of UNION

SELECT column1, column2, ...
FROM table1
UNION
SELECT column1, column2, ...
FROM table2;
  • column1, column2, ...: The columns you want to retrieve.
  • table1, table2: The tables from which you want to retrieve data.

Example of UNION

Suppose you have two tables, Employees and Managers, and you want to retrieve a list of all unique names from both tables.

Table: Employees

Name
Anand
Bala
Kavitha
Raj
Kumar

Table: Managers

Name
Ram
Karthik
David
Anand
Kannan
SELECT Name
FROM Employees
UNION
SELECT Name
FROM Managers;
Name
Anand
Bala
Kavitha
Raj
Kumar
Ram
Karthik
David
Kannan

Notice that the name β€œAnand” appears in both tables, but it is included only once in the result set because UNION removes duplicates.

UNION vs UNION ALL

  • UNION: Removes duplicate rows from the combined result set.
  • UNION ALL: Includes all rows, including duplicates, in the combined result set.

Example of UNION ALL

Using the same Employees and Managers tables:

SELECT Name
FROM Employees
UNION ALL
SELECT Name
FROM Managers;

Result

Name
Anand
Bala
Kavitha
Raj
Kumar
Ram
Karthik
David
Anand
Kannan

Here, the name β€œAnand” appears twice because UNION ALL does not remove duplicates.

When to Use UNION

  • Use UNION when you want to combine results from multiple queries and ensure that the final result set contains only unique rows.
  • Use UNION ALL when you want to combine results and are okay with including duplicate rows.

Important Considerations

  1. Performance: UNION can be slower than UNION ALL because it requires additional processing to remove duplicates.
  2. Column Names: The column names in the final result set are taken from the first SELECT statement.
  3. Ordering Results: If you need to order the final result set, you can use an ORDER BY clause at the end of the last SELECT statement.

Example with ORDER BY

SELECT Name
FROM Employees
UNION
SELECT Name
FROM Managers
ORDER BY Name;

Result

Name
Anand
Bala
David
Kannan
Karthik
Kavitha
Kumar
Raj
Ram

The result is sorted alphabetically by the Name column.

Practical Use Case

Imagine you have two tables, Students2022 and Students2023, and you want to create a list of all unique students who enrolled in either year.

Table: Students2022

Name
Siva
Ramesh
Suresh
Sathish

Table: Students2023

Name
Sujatha
Siva
Ramesh
Kannan
SELECT Name
FROM Students2022
UNION
SELECT Name
FROM Students2023;
Name
Siva
Ramesh
Suresh
Sathish
Sujatha
Kannan

The result contains all unique names from both tables.

Key Takeaways

  1. The UNION clause combines results from multiple SELECT statements into a single result set.
  2. It removes duplicate rows by default. Use UNION ALL to include duplicates.
  3. The number of columns and their data types must match in all SELECT statements.
  4. Use ORDER BY to sort the final result set.
  5. UNION is useful for combining data from multiple sources while ensuring uniqueness.