DB2

INNER JOIN – step by step walkthrough with examples

There are four basic types of JOINS – Inner, Outer (left, right, full), Self and Cross join. An inner join combines records from two tables based on a join predicate and requires that each record in the first table has a matching record in the second table. Thus, it returns only records from both joined tables that satisfy the join condition. Records that contain no matches are excluded from the result set.This operation helps in merging rows from different tables based on a specified condition, offering a powerful mechanism for data analysis and retrieval.

SELECT columns
FROM Left Table
INNER JOIN Right Table 
ON Left Table.column1 = Right Table.column2;

Let’s break this down:

  • SELECT columns: Choose the specific data you want to retrieve from the joined tables.
  • FROM Left Table, Right Table: Specify the tables you want to join.
  • INNER JOIN: This keyword declares the type of join.
  • ON Left Table.column1 = Right Table.column2: This defines the join condition, specifying the shared element linking the tables.

Inner Join clause is the same as Join clause and works the same way if we don’t specify the type (INNER) while using the Join clause. In short, it is the default keyword for Join and both can be used interchangeably.

There are two types of operations involved in the joins

Logical operations correspond to the joins types used in a query: INNER, OUTER, or CROSS.

Physical operations – the Query Optimizer chooses the best i.e. fastest to produce results physical operation applicable for the join. The Execution Plan of your query will show the physical join operators chosen. These operations are:

  • Nested Loop Join. This operation is fast if one of the two tables is small and the second one is large and indexed. It requires the least I/O with the fewest comparisons, but it’s not good for large result sets.
  • Merge Join. This is the fastest operation for large and sorted result sets by columns used in the join.
  • Hash Join. The query optimizer uses it when the result set is too large for a Nested Loop, and inputs are unsorted for a Merge join. A hash is more efficient than sorting it first and applying a Merge join.
  • Adaptive Join. it enables the choice between a nested loop or hash. The join method is deferred until the first input is scanned. This operation dynamically switches to a better physical join without recompiling.
FeatureINNER JOINStandard JOIN (CROSS JOIN)
Join ConditionExplicitly specified using ON clause.Implicitly combines all rows from both tables.
Result RowsOnly rows where the join condition is met.All possible combinations of rows from both tables.
PurposeCombining tables with shared data.Creating a Cartesian product of all rows.
Common UsesMerging customer and order data.Testing scenarios with multiple variables.
ExampleSELECT * FROM customers INNER JOIN orders ON customers.customerID = orders.customerIDSELECT * FROM customers, orders

Key Differences:

  • Join Condition: INNER JOIN requires an explicit condition to match rows, while Standard JOIN implicitly combines all rows.
  • Resulting Data: INNER JOIN returns only matching rows, while Standard JOIN returns all possible combinations, often leading to a much larger result set.
  • Purpose: INNER JOIN is used for combining related data, while Standard JOIN is often used for generating data sets or testing scenarios.
Input Data
 EmployeeName Table-(T1)Position Table-(T2)         
  ID     NAME           ID     TITLE         TEAM    
  10     Sandy          20     Sales Mgr        5
  20     Sam            30     Clerk           10 
  30     Cindy          30     Manager          2
                        40     Sales Rep        7
                        50     Sr. Manager.    11
 SELECT T1.NAME, 
        T1.ID, 
        T2.ID, 
        T2.TITLE
 FROM EmployeeName T1
 INNER JOIN Position T2 
 ON T1.ID = T2.ID
 ORDER BY T1.ID
        , T2.TITLE;

The above query can be represented by a standard join as given below

 SELECT T1.NAME, 
        T1.ID, 
        T2.ID, 
        T2.TITLE
 FROM EmployeeName T1,
      Position T2 
 WHERE T1.ID = T2.ID
 ORDER BY T1.ID
        , T2.TITLE;
Result         
 ID   NAME     ID     TITLE     
 20   Sam      20     Sales Mgr
 30   Cindy    30     Clerk
 30   Cindy    30     Manager

Below is the result set of the above SQL query. For each row in the table EmployeeName Table, Inner Join compares and finds the matching rows in the table Position Table and returns all the matching rows as shown below. And if you notice, ID values 10, 40 & 50 is excluded from the query result, as it does not make a match in the Position Table.

With the help of the above result set, we can make out the employee who doesn’t have a position & the positions which are not filled. For instance, Sandy has not held any position & Sales Rep, Sr. Manager positions are not filled.

We can filter records based on a specified condition when Inner Join is used with a WHERE clause. Assume that we would like to get the rows where Title is not ’Manager’. In the following query, the WHERE clause is added to extract results with Title ‘Manager’.

SELECT T1.NAME, 
        T1.ID, 
        T2.ID, 
        T2.TITLE       
 FROM EmployeeName T1 
 INNER JOIN Position T2 
 ON T1.ID = T2.ID    
 WHERE T2.TITLE <> ’Manager’
 ORDER BY T1.ID
        , T2.TITLE;
Result
 ID    NAME    ID     TITLE
 20    Sam     20     Sales Mgr
 30    Cindy   30     Clerk
FeatureON ClauseWHERE Clause
PurposeSpecifies the join condition between tablesFilters rows from the resulting joined table
LocationPart of the JOIN statementPart of the SELECT statement
Execution OrderExecuted before the WHERE clauseExecuted after the JOIN clause
ApplicabilityUsed with JOIN statementsUsed with SELECT, UPDATE, and DELETE statements
Result RowsAffects which rows are included in the joinAffects which rows are returned from the already joined table
ExampleSELECT * FROM customers INNER JOIN orders ON customers.customerID = orders.customerIDSELECT * FROM customers WHERE country = ‘USA’

Key Differences:

  • Purpose: ON defines how tables are connected, while WHERE filters the final result set.
  • Execution Order: ON is executed first to create the joined table, then WHERE filters it.
  • Applicability: ON is specifically for JOINs, while WHERE has broader use.
  • Result Rows: ON determines which rows are included in the join, while WHERE determines which rows are returned from the joined table.

It permits us to use Group by clause along with aggregate functions to group the result set by one or more columns. Group by works conventionally with Inner Join on the final result returned after joining two or more tables.

 SELECT T1.ID,
        T1.NAME, 
    SUM(T1.TEAM) AS EMPCOUNT        
 FROM EmployeeName T1          
 INNER JOIN Position T2  
 ON T1.ID = T2.ID    
 GROUP BY T1.ID
 ORDER BY T1.ID
        , T2.TITLE;
Result 
Result without Group By clause
 ID   NAME     EMPCOUNT
 20   Sam             5
 30   Cindy          10
 30   Cindy           2
 
Result with Group By clause        
 ID   NAME     EMPCOUNT
 20   Sam             5
 30   Cindy          12

Scenario: Analyze employee performance and satisfaction across multiple departments, considering their managers and assigned projects.

Tables:

  • employees (employeeID, name, departmentID, managerID)
  • departments (departmentID, name)
  • projects (projectID, name, managerID)
  • performance_reviews (employeeID, rating)
  • satisfaction_surveys (employeeID, satisfaction_score)
SELECT
    employees.name AS employee_name,
    departments.name AS department,
    projects.name AS project,
    performance_reviews.rating,
    satisfaction_surveys.satisfaction_score
FROM employees
INNER JOIN departments ON employees.departmentID = departments.departmentID
INNER JOIN projects ON employees.employeeID = projects.managerID
INNER JOIN performance_reviews ON employees.employeeID = performance_reviews.employeeID
INNER JOIN satisfaction_surveys ON employees.employeeID = satisfaction_surveys.employeeID
WHERE departments.name = 'Sales'; -- Input for specific department

Scenario: Identify top-selling products across different stores, considering their suppliers and stock levels.

Tables:

  • products (productID, name, supplierID)
  • suppliers (supplierID, name)
  • stores (storeID, name)
  • sales (productID, storeID, quantity_sold)
  • inventory (productID, storeID, quantity_in_stock)
SELECT
    products.name AS product_name,
    suppliers.name AS supplier,
    stores.name AS store,
    SUM(sales.quantity_sold) AS total_sales,
    SUM(inventory.quantity_in_stock) AS current_stock
FROM products
INNER JOIN suppliers ON products.supplierID = suppliers.supplierID
INNER JOIN sales ON products.productID = sales.productID
INNER JOIN stores ON sales.storeID = stores.storeID
INNER JOIN inventory ON products.productID = inventory.productID AND stores.storeID = inventory.storeID
WHERE suppliers.name = 'Acme Inc.'; -- Input for specific supplier
GROUP BY product_name, supplier, store
ORDER BY total_sales DESC

INNER JOIN is a powerful SQL operation for combining data from multiple tables based on specified conditions. It forms the backbone of relational databases, enabling efficient data retrieval and analysis. Mastering INNER JOIN is crucial for anyone working with databases and SQL queries. Mastering INNER JOINs empowers you to delve deeper into your database, uncovering valuable insights and forging connections that bring your data to life. So, unleash your inner data detective and start querying with confidence!

OUTER JOINs: Click Here IBM DB2 Manual :Click Here

Admin

Share
Published by
Admin
Tags: Inner Join

Recent Posts

Increase Transparency and Collaboration Product Backlog

A well-maintained product backlog is crucial for successful product development. It serves as a single…

5 days ago

Product Backlog – Incremental value to the customer

Incremental value to the customer refers to the gradual delivery of small, functional parts of…

7 days ago

Product Market, Customer’s Desire, Need, and Challenges

A Product Market refers to the group of potential customers who might be interested in…

1 week ago

PAL-EBM Professional Agile Leadership – EBM Certification

The Professional Agile Leadership - Evidence-Based Management (PAL-EBM) certification offered by Scrum.org is designed for…

3 months ago

PAL I Professional Agile Leadership Certification

The Professional Agile Leadership (PAL I) certification, offered by Scrum.org, is designed to equip leaders…

4 months ago

Scrum Master Certification: CSM, PSM, SSM

Choosing the right Scrum Master Certification depends on your current experience and career goals. If…

5 months ago