sql WITH clause trick


Starting in Oracle9i release 2 we saw an incorporation of the SQL-99 WITH clause (a.k.a. subquery factoring), a tool for materializing subqueries to save Oracle from having to re-compute them multiple times.

The SQL WITH clause is very similar to the use of Global temporary tables (GTT), a technique that is often used to improve query speed for complex subqueries. Here are some important notes about the Oracle WITH clause:

   • The SQL WITH clause only works on Oracle 9i release 2 and beyond.
   • Formally, the WITH clause is called subquery factoring
   • The SQL WITH clause is used when a subquery is executed multiple times
   • Also useful for recursive queries (SQL-99, but not Oracle SQL)

Let’s take a closer look at how the Oracle SQL WITH clause works:

All Stores with above-average sales
To keep it simple, the following example only references the aggregations once, where the SQL WITH clause is normally used when an aggregation is referenced multiple times in a query.

Here is an example of a request to see the names of all stores with above-average sales. For each store, we must compare their average sales to the average sales for all stores.
Essentially, this query accesses the STORE and SALES tables, comparing the sales for each store with the average sales for all stores. To answer this query we must know:

• The total sales for all stores.
• The number of stores.
• The sum of sales for each store.

To answer this in a single SQL statement we need to employ in-line views and also a subquery inside a HAVING clause:

select
   store_name,
   sum(quantity)                                                  store_sales,
   (select sum(quantity) from sales)/(select count(*) from store) avg_sales
from
   store  s,
   sales  sl
where
   s.store_key = sl.store_key
having
   sum(quantity) > (select sum(quantity) from sales)/(select count(*) from store)
group by
   store_name
;

While this query provides the correct answer, it is difficult to read and complex to execute, re-computing the sum of sales multiple times.
To prevent the unnecessary re-execution of the aggregation (sum(sales)), we could create temporary tables and use them to simplify our query.

1 - Create a table t1 to hold the total sales for all stores.
2 - Create a table t2 to hold the number of stores.
3 - Create a table t3 to hold the store name and the sum of sales for each store.
Then, write a fourth SQL statement that uses tables T1, T2, and T3 to replicate the output from the original query.  The final answer will look like this:
create table t1 as
select sum(quantity) all_sales from sales;

create table t2 as
select count(*) nbr_stores from stores;
create table t3 as
select store_name, sum(quantity) store_sales from store natural join sales;
select
   store_name
from
   t1,
   t2,
   t3
where
   store_sales > (all_sales / nbr_stores);


While this is a very elegant solution (i.e. easy to understand) and has faster execution time, we can also use the SQL-99 WITH clause instead of temporary tables.  The Oracle SQL WITH clause will compute the aggregation once, give it a name, and allow us to reference it (maybe multiple times), later in the query.

The SQL-99 WITH clause is very confusing at first because the SQL statement does not begin with the word SELECT. Instead, we use the WITH clause to start our SQL query, defining the aggregations, which can then be named in the main query as if they were “real” tables:

WITH
   subquery_name
AS
  (the aggregation SQL statement)
SELECT
  (query naming subquery_name);


Retuning to our oversimplified example, let’s replace the temporary tables with the SQL “WITH” clause”:

WITH
   sum_sales AS
      ( select
      sum(qty) all_sales from sales ),
   number_stores AS
      ( select
      count(*) nbr_stores from stores ),
   sales_by_store AS
      ( select
      stor_name, sum(qty) store_sales from
      stores natural join sales
      group by stor_name)
SELECT
   stores.stor_name
FROM
   stores,
   sum_sales,
   number_stores,
   sales_by_store
where
   store_sales > (all_sales / nbr_stores);


Note the use of the Oracle undocumented “materialize” hint in the WITH clause.  The Oracle materialize hint is used to ensure that the Oracle cost-based optimizer materializes the temporary tables that are created inside the WITH clause.  This is not necessary in Oracle10g, but it helps ensure that the tables are only created one time.
It should be noted that the WITH clause does not yet fully-functional within Oracle SQL and it does not yet support the use of WITH clause replacement for CONNECT BY when performing recursive queries.

To see how the WITH clause is used in ANSI SQL-99 syntax, here is an excerpt from Jonathan Gennick’s great work “Understanding the WITH Clause” showing the use of the SQL-99 WITH clause to traverse a recursive bill-of-materials hierarchy.

NOTE: This does NOT work (yet) with Oracle SQL

WITH recursiveBOM
   (assembly_id, assembly_name, parent_assembly) AS
(SELECT parent.assembly_id,
        parent.assembly_name,
        parent.parent_assembly
FROM bill_of_materials parent
WHERE parent.assembly_id=100
UNION ALL
SELECT child.assembly_id,
       child.assembly_name,
       child.parent_assembly
FROM recursiveBOM parent, bill_of_materials child
WHERE child.parent_assembly = parent.assembly_id)
SELECT assembly_id, parent_assembly, assembly_name
FROM recursiveBOM;

No comments:

Post a Comment