Back to basics: anti-joins and semi-joins

I almost forgot that the Oracle Joins series is not complete yet. So here is the last episode, a quick and easy anti-join and semi-join refresher.

Anti-joins:
Anti-joins are written using the NOT EXISTS or NOT IN constructs. An anti-join between two tables returns rows from the first table for which there are no corresponding rows in the second table. In other words, it returns rows that fail to match the sub-query on the right side.

Suppose you want a list of departments with no employees. You could write a query like this:

SELECT   d.department_name
    FROM departments d
MINUS
SELECT   d.department_name
    FROM departments d, employees e
   WHERE d.department_id = e.department_id
ORDER BY department_name;

The above query will give the desired results, but it might be clearer to write the query using an anti-join:

SELECT   d.department_name
    FROM departments d
   WHERE NOT EXISTS (SELECT NULL
                       FROM employees e
                      WHERE e.department_id = d.department_id)
ORDER BY d.department_name;

Semi-joins:
Semi-joins are written using the EXISTS or IN constructs. A semi-join between two tables returns rows from the first table where one or more matches are found in the second table. The difference between a semi-join and a conventional join is that rows in the first table will be returned at most once.

Suppose you want a list of departments with at least one employee. You could write the query like this:

SELECT   d.department_name
    FROM departments d, employees e
   WHERE d.department_id = e.department_id
ORDER BY department_name;

The department name in the query result will appear as many times as the number of employees in it. So, for example if a department has 30 employees then that department will appear in the query output 30 times.

To eliminate the duplicate rows, you could use the DISTINCT or GROUP BY keywords. A more elegant solution is to use a semi-join between the departments and employees tables instead of a conventional join:

SELECT   d.department_name
    FROM departments d
   WHERE EXISTS (SELECT NULL
                   FROM employees e
                  WHERE e.department_id = d.department_id)
ORDER BY d.department_name;

The above query will list the departments that have at least one employee. The department will appear only once in the query output no matter how many employees it has.

Sources and Resources:


Possibly related:


Tagged , | Post a Comment