Keep first, Keep last

A famous person once said: Oracle’s analytic SQL functions rock, Oracle’s analytic SQL functions roll. I totally agree. However, I would like to add to this that they rock and roll on a much faster beat if you know more about them. To explain what I mean, let’s take an example:

HR@XE> create table t (n number)
  2  /

Table created.

HR@XE> insert into t (n) values (100)
  2  /

1 row created.

HR@XE> insert into t (n) values (200)
  2  /

1 row created.

HR@XE> insert into t (n) values (300)
  2  /

1 row created.

HR@XE> insert into t (n) values (300)
  2  /

1 row created.

HR@XE> commit
  2  /

Commit complete.

HR@XE> begin
  2    dbms_stats.gather_table_stats('HR','T');
  3  end;
  4  /

PL/SQL procedure successfully completed.

HR@XE> select n from t order by n
  2  /

         N
----------
       100
       200
       300
       300

Now, I want to return the most frequent value of N (or, what’s called in statistics: mode). In the example above, it is 300. The following query, using the analytic function dense_rank, will do that for us:

HR@XE> select n
  2  from
  3    (select n,
  4       dense_rank() over(
  5     order by cnt desc) as
  6    rnk
  7     from
  8      (select n,
  9         count(*) as
 10      cnt
 11       from t
 12       group by n)
 13    x)
 14  y
 15  where rnk = 1
 16  /

         N
----------
       300


Execution Plan
----------------------------------------------------------
Plan hash value: 4148838322

---------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |     3 |    78 |     5  (40)| 00:00:01 |
|*  1 |  VIEW                    |      |     3 |    78 |     5  (40)| 00:00:01 |
|*  2 |   WINDOW SORT PUSHED RANK|      |     3 |     9 |     5  (40)| 00:00:01 |
|   3 |    HASH GROUP BY         |      |     3 |     9 |     5  (40)| 00:00:01 |
|   4 |     TABLE ACCESS FULL    | T    |     4 |    12 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("RNK"=1)
   2 - filter(DENSE_RANK() OVER ( ORDER BY COUNT(*) DESC )<=1)

Now, did you know that you could tune the above query by using an additional aggregate function called: FIRST.

KEEP … FIRST/LAST aggregate functions allow you to return the result of an aggregate applied over a set of rows that rank as the first or last with respect to a given order specification.

I will rewrite the above query like this:

HR@XE> select max(n) keep(dense_rank first
  2  order by cnt desc) n
  3  from
  4    (select n,
  5       count(*) cnt
  6     from t
  7     group by n)
  8  /

         N
----------
       300


Execution Plan
----------------------------------------------------------
Plan hash value: 548494858

-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |     1 |    26 |     4  (25)| 00:00:01 |
|   1 |  SORT AGGREGATE      |      |     1 |    26 |            |          |
|   2 |   VIEW               |      |     3 |    78 |     4  (25)| 00:00:01 |
|   3 |    HASH GROUP BY     |      |     3 |     9 |     4  (25)| 00:00:01 |
|   4 |     TABLE ACCESS FULL| T    |     4 |    12 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------

So, getting rid of the subquery and replacing it with KEEP… FIRST made a difference (as you can see from the Execution Plan), and gave us the same result. Sometimes, reading, or just quickly scanning, the documentation or other technical books and websites, is a good thing. It opens your eyes to some hidden, less used, options or commands or functions… that can come in handy in many situations like the one I descibed above.

Related articles:


Tagged , , | Post a Comment