MySQL Joins, UNION, GROUP BY, HAVING and Advanced SQL

A practical guide to advanced MySQL query writing, covering joins, UNION, GROUP BY, HAVING, EXISTS, ANY, ALL, INSERT SELECT, CASE, NULL functions, stored procedures, comments, and operators.

Jun 11, 2026
MySQL Joins, UNION, GROUP BY, HAVING and Advanced SQL

MySQL Joins, UNION, GROUP BY, HAVING and Advanced SQL

MySQL is not only about selecting rows from a single table. Real database work usually requires combining related tables, filtering grouped results, comparing values against subqueries, copying data between tables, handling NULL values correctly, and writing reusable database logic.

This article explains the advanced MySQL topics shown in the learning path: joins, INNER JOIN, LEFT JOIN, RIGHT JOIN, CROSS JOIN, self join, UNION, UNION ALL, GROUP BY, HAVING, EXISTS, ANY, ALL, INSERT SELECT, CASE, NULL functions, stored procedures, comments, and operators.

The goal is not only to memorize syntax. The goal is to understand when each feature should be used, how it behaves, and what mistakes developers should avoid when writing real MySQL queries.

MySQL Joins

A join is used to combine rows from two or more tables based on a related column between them. In relational databases, data is normally divided into multiple tables to avoid duplication and keep the design clean. For example, customers may be stored in one table, while orders are stored in another table.

Without joins, you would need to query each table separately and connect the data manually in your application. With joins, MySQL can return related data in one query.

Assume we have two simple tables:

customers
+----+---------+
| id | name    |
+----+---------+
| 1  | Adnan   |
| 2  | Noor    |
| 3  | Sara    |
+----+---------+

orders
+----+-------------+--------+
| id | customer_id | total  |
+----+-------------+--------+
| 1  | 1           | 120.00 |
| 2  | 1           | 80.00  |
| 3  | 2           | 50.00  |
+----+-------------+--------+

The relationship is clear: orders.customer_id refers to customers.id. A join allows us to show the customer name beside each order.

SELECT
    customers.name,
    orders.total
FROM customers
JOIN orders
    ON customers.id = orders.customer_id;

The ON condition tells MySQL how the tables are related. Without a correct join condition, the result can become incorrect or extremely large.

In professional projects, joins are used constantly in reports, dashboards, admin panels, search pages, invoices, user permissions, product catalogs, and analytics queries.

MySQL INNER JOIN

INNER JOIN returns only the rows that have matching values in both tables. If a customer has no orders, that customer will not appear in the result. If an order references a missing customer, that order will not appear either.

SELECT
    c.id,
    c.name,
    o.id AS order_id,
    o.total
FROM customers AS c
INNER JOIN orders AS o
    ON c.id = o.customer_id;

Here, c and o are table aliases. Aliases make queries shorter and easier to read, especially when several tables are joined.

Use INNER JOIN when you only want records that exist on both sides of the relationship. Common examples include:

  • Orders that belong to existing customers.

  • Comments that belong to existing posts.

  • Products that belong to existing categories.

  • Payments that belong to existing invoices.

In MySQL, writing JOIN without specifying the type usually means INNER JOIN.

MySQL LEFT JOIN

LEFT JOIN returns all rows from the left table and the matching rows from the right table. If there is no match, MySQL returns NULL for the right table columns.

This is useful when the main table must always appear, even if related records do not exist.

SELECT
    c.id,
    c.name,
    o.id AS order_id,
    o.total
FROM customers AS c
LEFT JOIN orders AS o
    ON c.id = o.customer_id;

This query returns all customers. Customers who have orders will appear with order data. Customers who do not have orders will still appear, but the order columns will contain NULL.

LEFT JOIN is commonly used to find missing related data. For example, to find customers who have never placed an order:

SELECT
    c.id,
    c.name
FROM customers AS c
LEFT JOIN orders AS o
    ON c.id = o.customer_id
WHERE o.id IS NULL;

The condition WHERE o.id IS NULL keeps only the rows where no matching order exists.

A common mistake is putting right-table filters in the WHERE clause when using LEFT JOIN. For example, if you write WHERE o.status = 'paid', the query may behave like an INNER JOIN because rows without orders have NULL values. To preserve the LEFT JOIN behavior, put optional right-side filters inside the ON condition when appropriate.

SELECT
    c.id,
    c.name,
    o.total
FROM customers AS c
LEFT JOIN orders AS o
    ON c.id = o.customer_id
   AND o.status = 'paid';

MySQL RIGHT JOIN

RIGHT JOIN returns all rows from the right table and the matching rows from the left table. If there is no match, MySQL returns NULL for the left table columns.

SELECT
    c.name,
    o.id AS order_id,
    o.total
FROM customers AS c
RIGHT JOIN orders AS o
    ON c.id = o.customer_id;

This query returns all orders, even if some orders do not have a matching customer. In a well-designed database with foreign key constraints, that situation should normally be prevented.

RIGHT JOIN is valid SQL, but many developers prefer to rewrite it as a LEFT JOIN by switching the table order. LEFT JOIN is usually easier to read because the main table appears first.

SELECT
    c.name,
    o.id AS order_id,
    o.total
FROM orders AS o
LEFT JOIN customers AS c
    ON c.id = o.customer_id;

This version usually communicates the intention more clearly: return all orders and attach customer data when available.

MySQL CROSS JOIN

CROSS JOIN returns the Cartesian product of two tables. This means every row from the first table is combined with every row from the second table.

If the first table has 3 rows and the second table has 4 rows, the result will contain 12 rows.

SELECT
    s.size,
    c.color
FROM sizes AS s
CROSS JOIN colors AS c;

This can be useful when generating all possible combinations, such as product sizes and colors, calendar dates and time slots, or available options in a configuration system.

CROSS JOIN must be used carefully. On large tables, it can create a huge result very quickly. For example, joining 10,000 rows with 10,000 rows produces 100,000,000 combinations.

MySQL Self Join

A self join is a join where a table is joined to itself. It is useful when rows in the same table are related to other rows in that same table.

A common example is an employees table where each employee may have a manager. The manager is also an employee in the same table.

SELECT
    e.name AS employee_name,
    m.name AS manager_name
FROM employees AS e
LEFT JOIN employees AS m
    ON e.manager_id = m.id;

The table is used twice, but with different aliases: e for employees and m for managers. Without aliases, MySQL would not know which version of the table you mean.

Self joins are also used for category trees, referral systems, parent-child relationships, threaded comments, organizational structures, and hierarchical data.

MySQL UNION

UNION combines the result of two or more SELECT queries into one result set. The SELECT statements must return the same number of columns, and the columns should have compatible data types.

SELECT email FROM customers
UNION
SELECT email FROM subscribers;

This query returns a combined list of emails from customers and subscribers. By default, UNION removes duplicate rows.

UNION is useful when the data comes from different sources but should be displayed as one list. For example, you may combine active users and invited users, old orders and new orders, or search results from multiple tables.

The column names in the final result usually come from the first SELECT statement.

SELECT name AS person_name, email FROM customers
UNION
SELECT full_name, email_address FROM leads;

Although the second SELECT uses different column names, the final output uses the aliases from the first SELECT.

If you need to sort the final combined result, place ORDER BY at the end of the full UNION query, not inside each SELECT unless you are using subqueries for a specific reason.

SELECT name, email FROM customers
UNION
SELECT full_name, email_address FROM leads
ORDER BY name;

MySQL UNION ALL

UNION ALL also combines multiple SELECT results, but it does not remove duplicates. This makes it faster than UNION in many cases because MySQL does not need to perform duplicate elimination.

SELECT email FROM customers
UNION ALL
SELECT email FROM subscribers;

Use UNION ALL when duplicate rows are acceptable or meaningful. For example, if you are combining logs from different tables, repeated values may represent real separate events and should not be removed.

A practical rule is simple: use UNION when you specifically need unique rows; use UNION ALL when you want all rows exactly as they come from the SELECT statements.

MySQL GROUP BY

GROUP BY groups rows that have the same value in one or more columns. It is usually used with aggregate functions such as COUNT(), SUM(), AVG(), MIN(), and MAX().

For example, to calculate the total sales amount for each customer:

SELECT
    customer_id,
    SUM(total) AS total_spent
FROM orders
GROUP BY customer_id;

This query does not return one row per order. It returns one row per customer, because the rows are grouped by customer_id.

You can also group by multiple columns. For example, sales per customer per status:

SELECT
    customer_id,
    status,
    COUNT(*) AS orders_count,
    SUM(total) AS total_amount
FROM orders
GROUP BY customer_id, status;

GROUP BY is heavily used in reporting and analytics. It helps answer questions such as:

  • How many orders does each customer have?

  • What is the total revenue per month?

  • How many posts are in each category?

  • What is the average salary per department?

When MySQL is running with strict grouping rules, every selected column should either be part of GROUP BY or be used inside an aggregate function, unless MySQL can determine that the column is functionally dependent on the grouped column. This rule prevents unclear results.

MySQL HAVING

HAVING filters grouped results. It is similar to WHERE, but WHERE filters rows before grouping, while HAVING filters groups after aggregation.

For example, to show only customers whose total spending is greater than 1000:

SELECT
    customer_id,
    SUM(total) AS total_spent
FROM orders
GROUP BY customer_id
HAVING SUM(total) > 1000;

WHERE cannot be used directly with aggregate results like SUM(total) in the same way because WHERE runs before the groups are created.

A common pattern is to use WHERE for normal row filters and HAVING for aggregate filters:

SELECT
    customer_id,
    COUNT(*) AS paid_orders
FROM orders
WHERE status = 'paid'
GROUP BY customer_id
HAVING COUNT(*) >= 3;

In this query, WHERE first keeps only paid orders. Then GROUP BY groups those paid orders by customer. Finally, HAVING keeps only customers with at least three paid orders.

MySQL EXISTS

EXISTS checks whether a subquery returns at least one row. It returns true if the subquery finds any matching record.

For example, to get customers who have at least one order:

SELECT
    c.id,
    c.name
FROM customers AS c
WHERE EXISTS (
    SELECT 1
    FROM orders AS o
    WHERE o.customer_id = c.id
);

The subquery is connected to the outer query through o.customer_id = c.id. This is called a correlated subquery because the inner query depends on the current row from the outer query.

EXISTS is often used when you care about the existence of related rows, not about returning values from them.

To find customers with no orders, use NOT EXISTS:

SELECT
    c.id,
    c.name
FROM customers AS c
WHERE NOT EXISTS (
    SELECT 1
    FROM orders AS o
    WHERE o.customer_id = c.id
);

NOT EXISTS is usually a clear and safe way to express anti-join logic, especially when NULL values could make NOT IN behave unexpectedly.

MySQL ANY

ANY compares a value with any value returned by a subquery. It returns true if the comparison is true for at least one value in the subquery result.

ANY is used with comparison operators such as =, >, <, >=, <=, and <>.

SELECT
    product_name,
    price
FROM products
WHERE price > ANY (
    SELECT price
    FROM products
    WHERE category_id = 5
);

This query returns products whose price is greater than at least one product price in category 5.

Another example: find orders whose total equals any total from high-priority orders:

SELECT
    id,
    total
FROM orders
WHERE total = ANY (
    SELECT total
    FROM orders
    WHERE priority = 'high'
);

In many situations, = ANY is similar to IN. However, ANY is more general because it can be used with different comparison operators.

MySQL ALL

ALL compares a value with every value returned by a subquery. It returns true only if the comparison is true for all values in the subquery result.

SELECT
    product_name,
    price
FROM products
WHERE price > ALL (
    SELECT price
    FROM products
    WHERE category_id = 5
);

This query returns products whose price is greater than every product price in category 5. In other words, the product price must be higher than the maximum price found in that category.

ALL is useful when the logic requires a value to pass a comparison against the entire subquery result, not only one row.

ANY and ALL are powerful, but they should be used carefully. In many cases, the same logic can be written with MIN(), MAX(), EXISTS, or joins, which may be easier for other developers to understand.

MySQL INSERT SELECT

INSERT SELECT inserts data into a table using the result of a SELECT query. It is useful when copying data, archiving records, creating reports, migrating selected rows, or preparing summary tables.

INSERT INTO archived_orders (order_id, customer_id, total, archived_at)
SELECT
    id,
    customer_id,
    total,
    NOW()
FROM orders
WHERE status = 'completed'
  AND created_at < '2025-01-01';

The columns in the INSERT list must match the number and compatible types of the values returned by the SELECT query.

You can also insert into one table from another table with the same structure:

INSERT INTO customers_backup (id, name, email)
SELECT id, name, email
FROM customers;

Before running INSERT SELECT on production data, it is wise to run the SELECT part alone first to confirm exactly which rows will be inserted.

MySQL CASE

CASE adds conditional logic inside a SQL query. It allows you to return different values depending on conditions.

CASE is useful for readable labels, calculated columns, custom sorting, conditional aggregation, and report formatting.

SELECT
    id,
    total,
    CASE
        WHEN total >= 1000 THEN 'High'
        WHEN total >= 500 THEN 'Medium'
        ELSE 'Low'
    END AS order_value
FROM orders;

This query creates a calculated column named order_value. The value depends on the order total.

CASE can also be used with GROUP BY reports. For example, count orders by value level:

SELECT
    CASE
        WHEN total >= 1000 THEN 'High'
        WHEN total >= 500 THEN 'Medium'
        ELSE 'Low'
    END AS value_level,
    COUNT(*) AS orders_count
FROM orders
GROUP BY value_level;

CASE does not change the stored data. It only changes how values are returned in the query result.

There are two common forms of CASE: searched CASE and simple CASE. Searched CASE checks full conditions. Simple CASE compares one expression against multiple values.

SELECT
    id,
    status,
    CASE status
        WHEN 'paid' THEN 'Payment received'
        WHEN 'pending' THEN 'Waiting for payment'
        WHEN 'cancelled' THEN 'Order cancelled'
        ELSE 'Unknown status'
    END AS status_label
FROM orders;

MySQL Null Functions

NULL means missing, unknown, or not applicable. It is not the same as zero, an empty string, or false. Because NULL represents an unknown value, normal comparisons can behave differently than beginners expect.

For example, this condition is wrong:

SELECT * FROM users
WHERE deleted_at = NULL;

To check for NULL, use IS NULL or IS NOT NULL:

SELECT * FROM users
WHERE deleted_at IS NULL;

MySQL provides several functions for working with NULL values.

IFNULL() returns an alternative value when the first value is NULL:

SELECT
    name,
    IFNULL(phone, 'No phone number') AS phone_display
FROM customers;

COALESCE() returns the first non-NULL value from a list. It is useful when several fallback values are possible:

SELECT
    name,
    COALESCE(work_phone, mobile_phone, home_phone, 'No phone') AS contact_phone
FROM customers;

NULLIF() returns NULL if two expressions are equal; otherwise, it returns the first expression:

SELECT
    product_name,
    NULLIF(discount_price, regular_price) AS real_discount_price
FROM products;

Null functions are important in reports because aggregate calculations and display values can become confusing if NULL values are ignored or handled incorrectly.

MySQL Stored Procedures

A stored procedure is a reusable block of SQL code stored inside the database. Instead of sending the same SQL logic from the application every time, you can call the procedure by name.

Stored procedures can accept parameters, execute multiple statements, use variables, include conditional logic, and return result sets.

DELIMITER //

CREATE PROCEDURE GetCustomerOrders(IN customerId INT)
BEGIN
    SELECT
        id,
        total,
        status,
        created_at
    FROM orders
    WHERE customer_id = customerId
    ORDER BY created_at DESC;
END //

DELIMITER ;

After creating the procedure, you can call it like this:

CALL GetCustomerOrders(1);

The DELIMITER command is used in many MySQL clients because the procedure body contains semicolons. Temporarily changing the delimiter allows the whole procedure definition to be treated as one statement.

Stored procedures can be useful for database-side operations, repeated reporting logic, legacy systems, and controlled access to specific operations. However, they should not be overused as a replacement for clean application architecture. In many modern web applications, business logic is often kept in the application layer, while stored procedures are used only when they provide a clear benefit.

A stored procedure with an output parameter can look like this:

DELIMITER //

CREATE PROCEDURE CountCustomerOrders(
    IN customerId INT,
    OUT ordersCount INT
)
BEGIN
    SELECT COUNT(*)
    INTO ordersCount
    FROM orders
    WHERE customer_id = customerId;
END //

DELIMITER ;

CALL CountCustomerOrders(1, @count);
SELECT @count AS orders_count;

MySQL Comments

Comments are notes written inside SQL code. MySQL ignores them during execution. They help explain complex queries, document decisions, or temporarily disable part of a query during testing.

MySQL supports single-line comments using -- or #:

-- Get active customers only
SELECT *
FROM customers
WHERE status = 'active';

# This is also a single-line comment
SELECT COUNT(*) FROM orders;

MySQL also supports multi-line comments using /* ... */:

/*
  This query calculates total revenue
  from paid orders only.
*/
SELECT
    SUM(total) AS revenue
FROM orders
WHERE status = 'paid';

Comments should explain why something is done, not repeat what the SQL already says. Good comments are helpful in complex joins, reports, migrations, and maintenance scripts.

MySQL Operators

Operators are symbols or keywords used to perform calculations, comparisons, logical conditions, and pattern matching. They are part of almost every SQL query.

Arithmetic operators are used for calculations:

SELECT
    price,
    quantity,
    price * quantity AS line_total
FROM order_items;

Common arithmetic operators include +, -, *, /, and %.

Comparison operators compare values:

SELECT *
FROM products
WHERE price >= 100
  AND stock <> 0;

Common comparison operators include =, <>, !=, >, <, >=, and <=.

Logical operators combine conditions:

SELECT *
FROM orders
WHERE status = 'paid'
  AND total > 500;

The most common logical operators are AND, OR, and NOT. Parentheses should be used when combining AND and OR to avoid unclear logic.

SELECT *
FROM orders
WHERE status = 'paid'
  AND (total > 500 OR priority = 'high');

Pattern and range operators are used for flexible filtering:

SELECT *
FROM customers
WHERE name LIKE 'A%'
  AND city IN ('Istanbul', 'Samsun', 'Ankara')
  AND created_at BETWEEN '2025-01-01' AND '2025-12-31';

Important SQL operators and expressions include LIKE, IN, BETWEEN, IS NULL, IS NOT NULL, EXISTS, ANY, and ALL.

MySQL also supports the NULL-safe equality operator <=>. It can compare values even when NULL is involved:

SELECT *
FROM users
WHERE deleted_at <=> NULL;

In normal application queries, IS NULL is usually clearer. However, the NULL-safe operator is useful in some comparison and synchronization scenarios.

How These MySQL Features Work Together

In real applications, these features are rarely used in isolation. A report query may join multiple tables, group results, filter aggregated values with HAVING, handle NULL values with COALESCE, and classify rows with CASE.

The following example combines several concepts in one practical report:

SELECT
    c.id,
    c.name,
    COALESCE(c.city, 'Unknown') AS city,
    COUNT(o.id) AS orders_count,
    SUM(CASE WHEN o.status = 'paid' THEN o.total ELSE 0 END) AS paid_total,
    CASE
        WHEN SUM(CASE WHEN o.status = 'paid' THEN o.total ELSE 0 END) >= 5000 THEN 'VIP'
        WHEN COUNT(o.id) >= 5 THEN 'Regular'
        ELSE 'New'
    END AS customer_level
FROM customers AS c
LEFT JOIN orders AS o
    ON c.id = o.customer_id
GROUP BY
    c.id,
    c.name,
    c.city
HAVING paid_total > 0
ORDER BY paid_total DESC;

This query returns customers with paid order activity, shows their city with a fallback value, counts their orders, calculates paid revenue, and assigns a customer level. It uses LEFT JOIN, GROUP BY, HAVING, COALESCE, CASE, COUNT(), SUM(), aliases, and ordering.

This is the type of query that appears in dashboards, CRM systems, analytics pages, ecommerce admin panels, and financial reports.

Performance and Best Practices

Advanced SQL features are powerful, but they must be written carefully. A query can be logically correct and still perform badly if the database grows.

Important practices include:

  • Use indexes on join columns, filter columns, and frequently grouped columns when appropriate.

  • Always write clear ON conditions for joins.

  • Avoid SELECT * in reports and joined queries unless you truly need every column.

  • Use UNION ALL instead of UNION when duplicate removal is not required.

  • Use WHERE before GROUP BY to reduce the number of rows being grouped.

  • Use HAVING only for conditions that depend on aggregate results.

  • Test SELECT queries before using them inside INSERT SELECT.

  • Use aliases consistently to improve readability.

  • Use EXPLAIN to inspect how MySQL plans to execute important queries.

Readable SQL is also a professional skill. A query that is formatted well, uses meaningful aliases, and separates logic clearly will be easier to debug, optimize, and maintain.

Conclusion

MySQL joins and advanced SQL features are essential for building real database-driven applications. INNER JOIN, LEFT JOIN, RIGHT JOIN, CROSS JOIN, and self joins help combine related data. UNION and UNION ALL combine results from multiple SELECT queries. GROUP BY and HAVING make reporting and aggregation possible.

EXISTS, ANY, and ALL allow more advanced subquery logic. INSERT SELECT helps copy and transform data between tables. CASE adds conditional output. NULL functions make missing values safer and clearer. Stored procedures provide reusable database-side logic. Comments improve maintainability, and operators form the foundation of filtering, comparison, and calculation.

To master MySQL, practice these topics with real tables and realistic data. Start with simple joins, then add grouping, filtering, CASE expressions, and subqueries step by step. The more you understand how these features work together, the easier it becomes to write clean, accurate, and efficient SQL for real projects.