EXISTS and NOT EXISTS are used with a subquery in the WHERE clause to examine if the result the subquery returns is TRUE or FALSE. It will identify the existence of a relationship without regard for quantity. For example, if the subquery returns at least one record in its result set, the EXISTS clause will evaluate to true and the condition will be met. If the subquery does not return any records, the EXISTS clause will evaluate as false and the condition will not be met. Because both are used with correlated subqueries, the subquery executes once for every row in the outer query. In other words, for each row in the outer query, by using information from the outer query, the subquery checks if it returns TRUE or FALSE, and then the value is returned to the outer query to use.
Notes
- If EXISTS (subquery) returns at least 1 row, the result is TRUE.
- If EXISTS (subquery) returns no rows, the result is FALSE.
- If NOT EXISTS (subquery) returns at least 1 row, the result is FALSE.
- If NOT EXISTS (subquery) returns no rows, the result is TRUE.
- Use EXISTS when you want to find rows that satisfy a condition based on the existence of related rows in another table.
- Use NOT EXISTS when you want to find rows where a related condition is not satisfied.
EXISTS Keyword in Sub-query
SELECT [Column Names] FROM [Source] WHERE EXISTS (Write Sub-query to Check)
Columns: It allows us to choose the number of columns from the tables. It may be One or more.
Source: One or more tables present in the Database. SQL JOINS are used to join multiple tables.
Subquery: Here we have to provide the Subquery. If the subquery returns true then it will return the records otherwise, it doesn’t return any records.
Performance consideration in SQL queries
- Use the smallest subquery possible: When using the EXISTS or NOT EXISTS operator, it’s important to keep the subquery as small and simple as possible. Avoid using complex expressions, aggregations, and group by clause, as they can slow down the query.
- Index the join columns: If you’re joining two tables in your subquery, make sure the join columns are indexed. This will help improve query performance.
- Avoid using NOT EXISTS: This operator can be slow, as it requires the database to evaluate the subquery for every row in the outer query. Whenever possible, try to use the EXISTS operator instead of the NOT EXISTS operator.
- Use IN instead of EXISTS: If you’re looking for specific values in a subquery, it may be more efficient to use the IN operator instead of the EXISTS operator.
- Avoid using correlated subqueries: A correlated subquery is a subquery that references columns from the outer query. Correlated subqueries can be slow, as the database must evaluate the subquery for each row in the outer query.
Input Tables
Table: customers
customer_id | customer_nm | address | city | Zip | country |
1 | Alex Angles | 230 Erwin Street | Woodland Hills | 91367 | USA |
2 | Frank Wilson | New Delhi Road | Delhi | 87653 | India |
3 | Daniel Perker | Hilton street | Paris | 75016 | France |
4 | Mark Punk | Italiano blvd | Berlin | 10101 | Italy |
5 | Thomas Wilson | 1007 N Jackson street | Milwakee | 78881 | USA |
Table: orders
order_id | customer_id | order_date | order_amt | Ship_id |
10001 | 1 | 2021-01-01 | 100.00 | 1 |
10002 | 5 | 2021-06-01 | 150.00 | 10 |
10003 | 3 | 2021-07-01 | 1000.00 | 3 |
10004 | 4 | 2021-07-15 | 250.50 | 2 |
10005 | 3 | 2021-08-01 | 225.50 | 5 |
Using EXISTS Condition with the SELECT Statement
SELECT * FROM customers WHERE EXISTS (SELECT * FROM orders WHERE customers.customer_id = orders.customer_id);
The above query will return all the customer IDs present in the order table. Except for customer id 2 rest of the customer IDs are present in the order table.
customer_id | customer_nm | address | city | Zip | country |
1 | Alex Angles | 230 Erwin Street | Woodland Hills | 91367 | USA |
3 | Daniel Perker | Hilton street | Paris | 75016 | France |
4 | Mark Punk | Italiano blvd | Berlin | 10101 | Italy |
5 | Thomas Wilson | 1007 N Jackson street | Milwakee | 78881 | USA |
Using EXISTS Condition with the UPDATE Statement
UPDATE orders SET order_amt = order_amt + 10 WHERE EXISTS (SELECT customer_id FROM customers WHERE customers.customer_id = order.customer_id);
The above statement will update the order value in the orders table for those customers who are present in the customers table, by increasing the current order value by 10 dollars. The final orders table will be
order_id | customer_id | order_date | order_amt | Ship_id |
10001 | 1 | 2021-01-01 | 110.00 | 1 |
10002 | 5 | 2021-06-01 | 160.00 | 10 |
10003 | 3 | 2021-07-01 | 1010.00 | 3 |
10004 | 4 | 2021-07-15 | 260.50 | 2 |
10005 | 3 | 2021-08-01 | 235.50 | 5 |
Using EXISTS Condition with the DELETE Statement
DELETE FROM orders WHERE EXISTS (SELECT * FROM customers WHERE customers.customer_id = orders.customer_id AND customers.country = 'USA');
The above query will delete the customers who are from USA.
order_id | customer_id | order_date | order_amt | Ship_id |
10002 | 5 | 2021-06-01 | 150.00 | 10 |
10003 | 3 | 2021-07-01 | 1000.00 | 3 |
10004 | 4 | 2021-07-15 | 250.50 | 2 |
NOT EXISTS Keyword in Sub-query
NOT condition can be combined with the EXISTS condition to create a NOT EXISTS condition. The NOT EXISTS phrases look for the non-existence of rows in the sub-query result set:
- If the sub-query matches on no rows, the result is true.
- If the sub-query has rows, the result is false.
SELECT [Column Names] FROM [Source] WHERE NOT EXISTS (Write Subquery to Check)
SELECT * FROM customers WHERE NOT EXISTS (SELECT * FROM orders WHERE customers.customer_id = orders.customer_id);
This above query would return all records from the customers table where there are no records in the orders table for the given customer_id.Only 1 record with customer id 2 is not present in order table.
customer_id | customer_nm | address | city | Zip | country |
2 | Frank Wilson | New Delhi Road | Delhi | 87653 | India |
Examples with sample Data
Table: STORE_SALES STORE_KEY | ORDER_NUMBER | CUST_KEY | SALES_AMT -----------+--------------+------------+----------- 166 | 36008 | 14818 | 150 9 | 188567 | 18222 | 35 45 | 202416 | 30333 | 10 113 | 66017 | 18705 | 200 199 | 111111 | 30999 | 300 166 | 121211 | 48999 | 500 198 | 75716 | 30231 | 199 27 | 150241 | 48353 | 599 148 | 182207 | 14111 | 50 24 | 250295 | 14999 | 799 121 | 251417 | 18999 | 999 121 | 199999 | 30999 | 700 Table: CUST_TABLE     CUST_KEY |  CUST_NAME       | CUST_STATE --------------+--------------------+---------------- 18999 | James J. Goldberg | AL 30999 | Sarah N. McCabe | NJ 48999 | Mark L. Brown | MA 30231 | Sandy N. Fish      | CA        48353 | Mark L. Man       | CA        14111 | Cheryl X. Hills   | TX 18222 | Kim J. Young      | MA 30333 | Sunoj N. Samuel   | CT      14818 | Sam X. Niel       | CA        18705 | Jim J. Okimoto    | CA 14999 | William X. Nielson | MA
1. The following query retrieves the list of all the customers who purchased anything from any of the stores amounting to more than 100 dollars & from CA state.
SELECT CUST_KEY, CUST_NAME, CUST_STATE FROM CUST_TABLE WHERE EXISTS (SELECT 1 FROM STORE_SALES WHERE STORE_SALES.CUST_KEY =CUST_TABLE.CUST_KEY AND SALES_AMT > 100) AND CUST_STATE = 'CA' ORDER BY CUST_KEY;
CUST_KEY | CUST_NAME | CUST_STATE --------------+--------------------+---------------- 14818 | Sam X. Niel | CA 18705 | Jim J. Okimoto | CA 30231 | Sandy N. Fish | CA 48353 | Mark L. Man | CA
2. The following query retrieves the list of all the customers who purchased anything from any of the stores amounting to less than 100 dollars & non-CA state.
SELECT CUST_KEY, CUST_NAME, CUST_STATE FROM CUST_TABLE WHERE NOT EXISTS (SELECT 1 FROM STORE_SALES WHERE STORE_SALES.CUST_KEY =CUST_TABLE.CUST_KEY AND SALES_AMT > 100) AND CUST_STATE <> 'CA' ORDER BY CUST_KEY;
CUST_KEY | CUST_NAME | CUST_STATE --------------+--------------------+---------------- 14111 | Cheryl X. Hills | TX 18222 | Kim J. Young | MA 30333 | Sunoj N. Samuel | CT
Here’s a complex example with explanations on EXISTS and NOT EXISTS subqueries:
Scenario:
Imagine a database containing tables for employees, departments, projects, and employee assignments to projects. We need to answer the following questions:
- Which employees have worked on projects in multiple departments?
- List departments with no active projects.
Using EXISTS to identify employees with projects in multiple departments:
SELECT EmployeeID, EmployeeName FROM Employees E WHERE EXISTS ( SELECT * FROM Projects P JOIN Departments D ON P.DepartmentID = D.DepartmentID WHERE E.EmployeeID = P.EmployeeID GROUP BY P.DepartmentID HAVING COUNT(DISTINCT P.ProjectID) >= 2 -- Ensure multiple projects );
Explanation:
- The outer query selects employee details.
- The EXISTS subquery checks for each employee:
- If there are multiple projects (COUNT(DISTINCT P.ProjectID) >= 2) in different departments (GROUP BY P.DepartmentID) associated with that employee.
- If the subquery returns true for an employee (meaning multiple projects across departments exist), that employee’s details are included in the final result.
Using NOT EXISTS to list departments with no active projects:
SELECT DepartmentID, DepartmentName FROM Departments D WHERE NOT EXISTS ( SELECT * FROM Projects P WHERE P.DepartmentID = D.DepartmentID AND P.ProjectStatus = 'Active' -- Filter for active projects );
Explanation:
- The outer query selects department details.
- The NOT EXISTS subquery checks for each department:
- If there are no active projects (P.ProjectStatus = ‘Active’) associated with that department.
- If the subquery returns true for a department (meaning no active projects exist), that department’s details are included in the final result.