Self Join in SQL: A Step-by-Step Guide

Self Join in SQL: A Step-by-Step Guide

28 Oct 2024
Beginner
506 Views
14 min read
Learn with an interactive course and practical hands-on labs

SQL Server Course

SELF JOIN in SQL

SELF JOIN in SQL is an approach to merge a table with itself. Comparing rows inside the same table or arranging the data in the table so that it makes sense might be helpful. By combining a table with itself and utilizing distinct aliases to distinguish between numerous instances, you may get relevant data in SQL with a SELF JOIN.

In this SQL Server Tutorial, you will understand the SELF JOIN in SQL, including when should you use SELF JOINs in SQL, Syntax for the SELF JOIN in SQL, SELF JOIN example, advantages of SELF JOIN in SQL, common use cases for SELF JOIN in SQL, and many more.

What is SELF JOIN in SQL?

SELF JOIN in SQLis simply a regular type of JOIN in SQL where a table is joined with itself. The same table is referenced twice in the query with different aliases, which allows for comparing rows within that table or organizing hierarchical data. SELF JOIN in SQL can be beneficial when you want to relate rows in a single table, such as finding relationships between employees and their managers or identifying parent-child structures.

What is SELF JOIN in SQL

When should you use a SELF JOIN in SQL?

There are some important situations where you should use SELF JOIN in SQL where:

  • You need to compare rows in a table.
  • You have hierarchical data in a single table, such as an employee reporting structure.
  • There is a parent-child relationship within the same dataset.

How does SELF JOIN work in SQL?

In a SELF JOIN in SQL, a table is treated as if it were two separate tables. You use aliases to refer to these two different instances of the same table, then perform the join based on the relationship between columns.

Syntax for a SELF JOIN in SQL

SELECT a.column1, b.column2
FROM table_name a
JOIN table_name b
ON a.common_column = b.common_column;  

Here, a and b are aliases for the same table, and the join condition is defined in the ON clause.

Example Scenario

Consider having an Employee table with data on workers and their supervisors. Every staff member possesses a Manager_ID, which links to their manager's Employee_ID.

Here is the structure of the Employee table:

Employee_IDEmployee_NameManager_ID
1AnkitNULL
2Rakesh1
3Shalini1
4Manish2
5Priya2

Example

To find out the name of each employee along with their manager’s name, we can use a Self JOIN:

SELECT e1.Employee_Name AS Employee, e2.Employee_Name AS Manager
FROM Employee e1
LEFT JOIN Employee e2
ON e1.Manager_ID = e2.Employee_ID;    

Explanation:

  • We use e1 and e2as aliases for the same Employee table.
  • The join condition e1.Manager_ID = e2.Employee_ID ensures that the employee’s Manager_ID matches the Employee_ID of their manager.
  • The LEFT JOIN is used to ensure that employees without a manager (like Ankit, who is at the top of the hierarchy) still appear in the result.

Result:

EmployeeManager
AnkitNULL
RakeshAnkit
ShaliniAnkit
ManishRakesh
PriyaRakesh

Self Join with ORDER BY Clause

The ORDER BY clause in a SELF JOIN in SQL sorts the result based on one or more columns from either of the joined instances of the same table. It helps arrange the result set in ascending or descending order.

Syntax of Self Join with ORDER BY Clause

SELECT a.column1, b.column2
FROM table_name a
JOIN table_name b
ON a.common_column = b.common_column
ORDER BY a.column1 ASC, b.column2 DESC;   

Example

Consider the Employee table and sort the results according to the manager's name in descending order and the employee's name in ascending order.

SELECT e1.Employee_Name AS Employee, e2.Employee_Name AS Manager
FROM Employee e1
LEFT JOIN Employee e2
ON e1.Manager_ID = e2.Employee_ID
ORDER BY e1.Employee_Name ASC, e2.Employee_Name DESC;   

Result

EmployeeManager
AnkitNULL
ManishRakesh
PriyaRakesh
RakeshAnkit
ShaliniAnkit

Some Other Examples of SELF JOIN in SQL

1. Compare Columns In The Same Table

Suppose we have a table Employees with employee details like EmployeeID, Name, and ManagerID, and we want to compare an employee with their manager to get both of their names.

EmployeeIDNameManagerID
1AnkitNULL
2Ravi1
3Sneha1
4Priya2

Query:

SELECT e1.Name AS Employee, e2.Name AS Manager
 FROM Employees e1
JOIN Employees e2 ON e1.ManagerID = e2.EmployeeID;

2. Identify Duplicate Rows

Assume we have a table Students where we suspect some rows may be duplicated based on names.

StudentIDNameClass
1Amit5
2Sumit5
3Amit5

Query:

SELECT s1.StudentID, s1.Name, s1.Class
 FROM Students s1
JOIN Students s2 ON s1.Name = s2.Name AND s1.Class = s2.Class
WHERE s1.StudentID > s2.StudentID;

3. Hierarchical Data

Let's say we have a Departments table where each department has a parent department, and we want to display both the department and its parent.

DeptIDDeptNameParentDeptID
1CorporateNULL
2HR1
3IT1
4Recruitment2

Query

 SELECT d1.DeptName AS Department, d2.DeptName AS ParentDepartment
 FROM Departments d1
 LEFT JOIN Departments d2 ON d1.ParentDeptID = d2.DeptID;

4. Comparing Historical Snapshot Data

Suppose we have a Sales table containing monthly sales data for different products, and we want to compare the sales for a product across two different months.

ProductIDMonthSales
1Jan100
1Feb150
2Jan200
2Feb220

Query:

SELECT s1.ProductID, s1.Month AS Month1, s1.Sales AS Sales1,
 s2.Month AS Month2, s2.Sales AS Sales2
 FROM Sales s1
JOIN Sales s2 ON s1.ProductID = s2.ProductID
WHERE s1.Month = 'Jan' AND s2.Month = 'Feb';

Advantages of SELF JOIN in SQL

  • Hierarchical Data Handling: SELF JOINs are particularly useful in dealing with hierarchical or recursive data, like organizational charts where employees report to managers.
  • Flexible Queries: You can compare different rows within the same table, enabling complex queries without needing multiple tables.
  • Simpler Database Structure: Instead of creating multiple tables for hierarchical or related data, you can use a SELF JOIN in SQL to manage everything within one table.

Common Use Cases for SELF JOIN in SQL

SELF JOINs in SQL can be used for various purposes, such as:

  • Finding Duplicate Records: By connecting a table based on matching columns, you may perform a SELF JOIN to identify duplicate records.
  • Comparing Rows: If you need to compare rows inside a table, for example, to see whether employees are paid more than others, a SELF JOIN may be helpful.

For example, if the Employee table has a Salary column and you need to identify which employees make more than others, you may write:

SELECT e1.Employee_Name, e1.Salary, e2.Employee_Name, e2.Salary
FROM Employee e1
JOIN Employee e2
ON e1.Salary > e2.Salary;  

This query compares employees' salaries within the same table and retrieves those who earn more than others.

SELF JOIN Comparison with Other JOINS

Here is a quick comparison between SELF JOIN and other joins:

Join TypeCompared with Self-Join
Self-Join It compares rows within the same table and is useful for hierarchical data or finding relationships within a single dataset.
INNER JOIN It Compares rows between two different tables, returning only the matching rows based on a condition.
LEFT JOIN It retrieves all rows from the left table and matching rows from the right table (or NULL if no match), using two tables.
RIGHT JOIN It retrieves all rows from the right table and matching rows from the left table (or NULL if no match), using two tables.
FULL OUTER JOIN It returns all rows from both tables, showing matches and non-matches, whereas a self-join compares only within one table.
CROSS JOIN It produces a Cartesian product of two tables, returning all combinations of rows, unlike self-join, which is conditional.

Read More: Top 50 SQL Server Interview Questions and Answers

Conclusion

SELF JOINs in SQL are a powerful technique for comparing data within the same table or working with hierarchical relationships. By using aliases and carefully crafting your join condition, you can unlock valuable insights from your data without needing separate tables. It’s particularly helpful for organizational structures, finding relationships within a dataset, or comparing rows. To master database management with our SQL Server Certification, take the next step in your career by enrolling in the ScholarHat SQL Server Course.

FAQs

A Self Join is used when a table has a relationship with itself, such as when you want to compare rows within the same table or establish relationships between rows in the same table. Common use cases include hierarchical data or finding pairs of related data in the same table. 

While most joins combine rows from two different tables, a Self Join works with only one table by creating two aliases for the same table and treating them as if they are two different tables. 

Since the same table is being used twice, you need to use table aliases to differentiate between the instances of the table. For example, in the query above, e1 and e2 are aliases for the same employees table. 

You should avoid using Self Join if it leads to performance issues on large datasets. In such cases, it might be better to restructure your query or database design, especially if you're dealing with hierarchical data that could benefit from more advanced techniques like recursive CTEs. 

No, a Self Join can be of any join type—INNER JOIN, LEFT JOIN, RIGHT JOIN, etc., depending on your requirement. 

A Self Join can affect performance, especially if the table is large or if there are no proper indexes. Always ensure your table is optimized with indexes for the columns involved in the join condition. 

Take our Sqlserver skill challenge to evaluate yourself!

In less than 5 minutes, with our skill challenge, you can identify your knowledge gaps and strengths in a given skill.

GET FREE CHALLENGE

Share Article
About Author
Shailendra Chauhan (Microsoft MVP, Founder & CEO at ScholarHat)

Shailendra Chauhan, Founder and CEO of ScholarHat by DotNetTricks, is a renowned expert in System Design, Software Architecture, Azure Cloud, .NET, Angular, React, Node.js, Microservices, DevOps, and Cross-Platform Mobile App Development. His skill set extends into emerging fields like Data Science, Python, Azure AI/ML, and Generative AI, making him a well-rounded expert who bridges traditional development frameworks with cutting-edge advancements. Recognized as a Microsoft Most Valuable Professional (MVP) for an impressive 9 consecutive years (2016–2024), he has consistently demonstrated excellence in delivering impactful solutions and inspiring learners.

Shailendra’s unique, hands-on training programs and bestselling books have empowered thousands of professionals to excel in their careers and crack tough interviews. A visionary leader, he continues to revolutionize technology education with his innovative approach.
Accept cookies & close this