msgbartop
News, views, tips and tricks on Oracle and other fun stuff
msgbarbottom

SQL problem

I have noticed this interesting thread on the Oracle-l mailing list about a “seemingly simple SQL requirement”. The question was about how to write a query that returned NO rows from a table t – with an id column – whenever a given list of id’s contained at least one id that did not exist in the table t. To illustrate:

scott@EDDEV> create table t(id number);

Table created.

scott@EDDEV> insert into t values(1);

1 row created.

scott@EDDEV> insert into t values(2);

1 row created.

scott@EDDEV> select * from t;

        ID
----------
         1
         2

scott@EDDEV> select * from t where id in (1,2);

        ID
----------
         1
         2

scott@EDDEV> select * from t where id in (1,2,3);

        ID
----------
         1
         2

The requirement was that the last query above should not return any rows because the id 3 does not exist in the table t. Here are the three solutions that were presented and my take on each one:

First, this query:

scott@EDDEV> select id
  2  from (select id, count(id) over () rows_in_set
  3       from t
  4       where id in (1, 2, 3))
  5  where rows_in_set = 3;

no rows selected

Neat. But there are two issues with this query. First, you need to know how many id’s you have in your id list before running the query, that requires another hard coded input to the query. Second, it does not work as expected if there are duplicate id’s in the id list:

scott@EDDEV> select id
  2  from (select id, count(id) over () rows_in_set
  3       from t
  4       where id in (1,2,2))
  5  where rows_in_set = 3;

no rows selected

Based on the requirement, since all id numbers in the id’s list exist in t, the above query should have returned rows 1 and 2.

On to the second offered solution:

scott@EDDEV> select * from t
  2  where exists(select 1 from t where id=1)
  3  and   exists(select 1 from t where id=2)
  4  and   exists(select 1 from t where id=3);

no rows selected

Works great, but does not scale (no binding). What if instead of 1,2 and 3, we had 100 id’s?

The last solution was:

scott@EDDEV> with
  2  x as
  3    (select
  4      length(
  5        replace(
  6          translate('&&1','1234567890','           '),
  7          ' ')
  8        )+1 c1
  9     from dual)
 10  ,y as
 11    (select count(*) c2
 12     from t where id in (&1))
 13  select t.*
 14    from t, x a, y b
 15   where id in (&1)
 16     and a.c1 = b.c2;
Enter value for 1: 1,2
old   6:         translate('&&1','1234567890','           '),
new   6:         translate('1,2','1234567890','           '),
old  12:    from t where id in (&1))
new  12:    from t where id in (1,2))
old  15:  where id in (&1)
new  15:  where id in (1,2)

        ID
----------
         1
         2

scott@EDDEV> define 1 = 1,2,3
scott@EDDEV> /
old   6:         translate('&&1','1234567890','           '),
new   6:         translate('1,2,3','1234567890','           '),
old  12:    from t where id in (&1))
new  12:    from t where id in (1,2,3))
old  15:  where id in (&1)
new  15:  where id in (1,2,3)

no rows selected

Very nice. Basically, the concept is the same as the first solution but the query is written in a more generic way. The “x” query returns the number of elements in the list of id’s. That’s the hardcoded value in the first solution above. Like the first solution, this solution does not work as expected if there are duplicate id’s in the id list:

scott@EDDEV> define 1 = 2,2
scott@EDDEV> /
old   6:         translate('&&1','1234567890','           '),
new   6:         translate('2,2','1234567890','           '),
old  12:    from t where id in (&1))
new  12:    from t where id in (2,2))
old  15:  where id in (&1)
new  15:  where id in (2,2)

no rows selected

The above query also assumes that the column is always numeric and the delimiter is always a comma (otherwise the “y” query will return wrong results.) Also note that the “WITH” clause is only supported in Oracle version 9i and above.


Filed in Oracle on 11 Oct 05 | Tags:


Reader's Comments

  1. |

    There is a more generic way, but it would involve creating a SQL type and a function to convert comma-seperated list to a collection. Then using set logic you can satisfy the requirement and it will handle duplicate values as well in the input. The solution given below has been tested using the table structure as you have shown above, but it can handle any type of table:

    create or replace type tablist as table of varchar2(255);
    
    create or replace function my_comma_to_table(csv varchar2)
    return tablist is
      l_string        long default csv || ',';
      l_data          tablist := tablist();
      n               number;
    begin
    loop
      exit when l_string is null;
      n := instr( l_string, ',' );
      l_data.extend;
      l_data(l_data.count) := 
        ltrim( rtrim( substr( l_string, 1, n-1 ) ) );
      l_string := substr( l_string, n+1 );
    end loop;
    return l_data;
    end my_comma_to_table;
    
    select * from t
    where 0=(select cnt_a-cnt_b
        from (select count(*) cnt_a 
        from (select to_number(column_value) 
        from table(cast (my_comma_to_table('1,2,2') as tablist)) 
        union select id from t)) a,
                  (select count(*) cnt_b from t) b
         )
    /
    

    The above would handle even when the list contains alphanumeric values. In that case just remove the to_number function in the select query.

    Regards Partha

  2. |

    Partha, thanks, your solution works great (even though it needed some PL/SQL help). However, a minor modification to the query is needed to make it work for both numeric and alphanumeric input. In addition to removing the to_number, a to_char needs to be added as well. that way, it’ll work for both numeric and alphanumeric all the time:

    scott: select * from t
      2  where 0=(select cnt_a-cnt_b
      3      from (select count(*) cnt_a
      4      from (select column_value
      5      from table(cast (my_comma_to_table('1,a,2') as tablist))
      6      union select id from t)) a,
      7                (select count(*) cnt_b from t) b
      8       )
      9  /
        from (select column_value
                     *
    ERROR at line 4:
    ORA-01790: expression must have same datatype as 
    corresponding expression
    
    
    scott: select * from t
      2  where 0=(select cnt_a-cnt_b
      3      from (select count(*) cnt_a
      4      from (select column_value
      5      from table(cast (my_comma_to_table('1,a,2') as tablist))
      6      union select to_char(id) from t)) a,
      7                (select count(*) cnt_b from t) b
      8       )
      9  /
    
    no rows selected
    
  3. |

    How about these two methods:

    create table id_tab (id number);
    insert into id_tab select rn 
        from (select rownum rn from dba_objects) where mod(rn, 20) != 0;
    insert into id_tab select rn 
        from (select rownum rn from dba_objects) where mod(rn, 10) != 0;
    insert into id_tab select rn 
        from (select rownum rn from dba_objects) where mod(rn, 5) != 0;
    commit;
    
    create table id_domain (id number);
    insert into id_domain select rownum from dba_objects where rownum 
    
  4. |

    OK, another (shorter) try at posting this (id_domain contains 1..1000):

    select id
    from (
        select t.id, count(distinct t.id) over () cnt1, cnt2
        from (
            select id, count(id) over () cnt2
            from id_domain
            where id in (1,2,17,17)
        ) p, id_tab t
        where t.id = p.id
    )
    where cnt1 = cnt2;
    
    
    select id
    from (
        select id
        from id_tab
        where id in (1,2,17,17)
    )
    where not exists (
        select null
        from id_domain
        where id in (1,2,17,17)
            and id not in (
                select id
                from id_tab
            )
    );
    
  5. |

    Perhaps simplistic, but here’s my take on this problem:

    jeffkemponoracle.blogspot.com/2005/10/sql-problem.html

  6. |

    It uses a collection type (but in my case I use the predefined one). Depends on whether they want to pass the list as a single variable (which will need parsing into its components using PL/SQL as per Partha’s solution).

    create table t(id number);
    
    insert into t values(1);
    
    insert into t values(2);
    
    select * from 
    (select t.id, x.column_value x, 
    count(distinct x.column_value) over () count_x,
    count(distinct t.id) over () count_t
    from t, table(sys.dbms_debug_vc2coll(1,2,3)) x
    where t.id(+) = to_number(x.column_value))
    where count_x = count_t
    /
    
    select * from 
    (select t.id, x.column_value x, 
    count(distinct x.column_value) over () count_x,
    count(distinct t.id) over () count_t
    from t, table(sys.dbms_debug_vc2coll(1,2)) x
    where t.id(+) = to_number(x.column_value))
    where count_x = count_t
    /
    
    select * from 
    (select t.id, x.column_value x, 
    count(distinct x.column_value) over () count_x,
    count(distinct t.id) over () count_t
    from t, table(sys.dbms_debug_vc2coll(1)) x
    where t.id(+) = to_number(x.column_value))
    where count_x = count_t
    /
    
  7. |

    There is a problem with Gary’s solution as it would give duplicate values if the collection contained any duplications. As per the requirement, we cannot have duplicate rows returned from t even if the values appear duplicate in the input list or collection(whatever you might call it).

    Jeff’s solution was what I wanted to acheive, but really could not get it working. I got around to it using Set Union, but not a very neat solution. Please note the trick for removing duplicate values was using a Set operation. Jeff’s solution could be further improved by removing the where id in clause completely. The Set minus operation does the job neatly.

    So using the improvement on Jeff’s solution and using any method to manipulate the input list, the solution could be as below :

    select * from t
    where not exists (
        select to_number(column_name) 
        from Input_Table 
        minus 
        select id from t)
    

    The Input_Table can be computed from input list of values either using PL/SQL method (Partha), sys.dbms_debug_vc2coll (Gary) or temporary table method (Jeff). This would be driven by the assumptions you make of the input data set.

    The column_name would be column_name in case of PL/SQL method, column_value in case of dbms_debug_vc2coll method or the column name of the temporary table in case of temporary table method.

  8. |

    Great post! That is an interesting SQL puzzler!

  9. |

    If you remove the “id in (select id from u)” (or the equivalent “exists (…)”, won’t the query return values not in the query set? I.E.:

    SQL> select * from t;
            ID
    ----------
             1
             2
             3
    
    SQL> select * from u;
            ID
    ----------
             1
             2
             3
    
    SQL> insert into t values (4);
    1 row created.
    
    SQL> select * from t 
        where not exists (select id from u minus select id from t);
            ID
    ----------
             1
             2
             4
             3
    
    SQL> select * from t where id in (select id from u) 
        and not exists (select id from u minus select id from t);
            ID
    ----------
             1
             2
             3
    

    – in the first query, 4 is incorrectly returned, yes?

  10. |

    After re-reading the original post it appears it was not a requirement to limit the rows returned unless all the query rows existed in it; so my last comment is probably superfluous if it doesn’t matter if all the rows get returned.

    Another case of needing to get the exact user requirements…

  11. |

    select * from t a where a.id in (&1) and not exists ( select * from table(sys.KU$_OBJNUMSET(&1)) b where not exists ( select * from t c where c.id=b.column_value))