Group Functions and SubQueries

This page will roughly cover the five GROUP functions, GROUP BY, and using SubQueries.

Concepts

GROUP BY

We’ve previously seen that Group Functions summarize values in an interim answer set return values grouped across many rows. The GROUP BY clause enables SQL to provide subtotals. It can perform the group function across a subset of rows in the interim answer set and provide a total for each subset of rows.

Common Group Functions

  • SUM: can only be used on numeric columns
  • AVG: can only be used on numeric columns
  • COUNT: can be used with any data type column (or with * to simply count rows)
  • MIN: can be used with any data type column
  • MAX: can be used with any data type column

To avoid mismatch, remember this rule:

When using a GROUP BY, every column/expression in the SELECT statement must either be:

  • a group function
  • a column that is being grouped by

Also remember that even though the group function is sorting the interim answer set in order to calculate the totals, we must still use an order by clause to list the totals in group sequence.

HAVING

Defines a condition that selects a subset of rows from the answer set based on the group functions.

When using a group by, the having is like a where clause against the interim answer set.

The syntax rules for HAVING are the same as for WHERE.

SubQuery

Simply a a query within a query, and has 3 modes of use:

  1. Subquery in WHERE: answer set to the inner query is used as a predicate in a where clause in the outer query.
  2. Subquery in SELECT: answer set to the inner query is used as a column in a select.
  3. Subquery in FROM: answer set to inner query is used as a virtual table in a from clause.

Subqueries must always be enclosed in parentheses.

Examples

Example 1


Query

Recall this will generate an error:

select order_id, product_id, sum(unit_price), sum(quantity)
from order_details
where order_id in (10248, 10249, 10250, 10251);

Result: ERROR, the level of the group function must match the level of detail in the select statement.

By adding a GROUP BY clause, we avoid the mismatch and get subtotals by order_id.

Query

select order_id, product_id, sum(unit_price), sum(quantity)
from order_details
where order_id in (10248, 10249, 10250, 10251)
group by order_id;

Result: subtotals by the specified groups within the variable specified by the group by clause.


Example 2


Query

select country, count(customer_id) as total
from customers
group by country;

Result: returns the number of customers in each country.


Example 3


Query

select supplier_id, avg(unit_price) as total
from products
group by supplier_id;

Result: returns the average product price for each supplier.


Example 4


Query

select supplier_id, sum(units_in_stock) as inventory
from products
group by supplier_id
order by 2 desc limit 1;

Result: returns the supplier that northwinds carries the most inventory from.


Example 4


Query

select country, count(customer_id) as total
from customers
group by country
having count(customer_id) > 5
order by 2 desc;

Result: returns the countries in which northwinds has more than five customers.


Result: returns the supplier that northwinds carries the most inventory from.


Example 5


Using the common Group Functions.

Query

select COUNT(*) as "Staff Size"
from employees;

select COUNT(Distinct country) as "Countries"
from customer;

select SUM(units_in_stock) as "Inventory"
from inventory;

select MAX(unit_price) as "High Price"
from products;

select MIN(unit_price) as "Low Price"
from products;

Result: returns results from database columns using the common group functions.


Example 6


Query

select to_char(avg(unit_price), '999,999.99') as "Average Price"
from products;

Result: returns rounded average price from unit price.


Example 7


Predicate in a WHERE clause with the = operator.

Query

select product_id, product_name, unit_price
from products
where unit_price = (
  select max(unit_price)
  from products
);

Result: returns the product_id, name, and unit_price of the highest priced product.


Example 8


Predicate in a WHERE clause with the in operator.

Note: this uses TWO different tables.

Query

select customer_id, order_id
from orders
where order_id in (
  select order_id
  from order_details
  where quantity > 100
);

Result: returns the order_ids from the orders database where order quantity is above 100 from the order_details database.


Example 9


Query

select product_name,
(select sum(unit_price * quantity)
  from order_details
  where order_details.product_id = products_order_id) as "Total
from products;

Result: list each product name and the total value of that product’s order.


Example 10


Query

select order_id
from (select order_id, sum(quantity)
  from order_details
  group by order_id
  having sum(quantity) < 100)
as detail_count;

Result: create a list of all orders with fewer than 100 items sold (note: the subquery must have an alias name).