Re: difference between UNION operator and OR in where clause

From: Bob Badour <bbadour_at_pei.sympatico.ca>
Date: Tue, 10 Jul 2007 17:18:04 -0300
Message-ID: <4693e962$0$8835$9a566e8b_at_news.aliant.net>


Mike wrote:

> On Jul 10, 1:36 pm, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
> 

>>Mike wrote:
>>
>>>On Jul 10, 11:18 am, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
>>
>>>>Mike wrote:
>>
>>>>>On Jul 10, 10:40 am, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
>>
>>>>>>Mike wrote:
>>
>>>>>>>On Jul 10, 9:59 am, David Portas
>>>>>>><REMOVE_BEFORE_REPLYING_dpor..._at_acm.org> wrote:
>>
>>>>>>>>On 10 Jul, 14:39, Mike <gongweig..._at_gmail.com> wrote:
>>
>>>>>>>>>tables:
>>>>>>>>>1. publication(pubid, title)
>>>>>>>>>2. book(pubid, date)
>>>>>>>>>3. journal(pubid, date)
>>
>>>>>>>>>purpose: find all publication titles for books or journals.
>>
>>>>>>>>>solution 1: use set "union"
>>
>>>>>>>>>select title
>>
>>>>>>>>>from publication, ( (select pubid from book) UNION (select pubid from
>>
>>>>>>>>>journal)) bj
>>>>>>>>>where publication.pubid = bj.pubid
>>
>>>>>>>>>solution 2: use "or" operator in the where clause
>>
>>>>>>>>>select title
>>
>>>>>>>>>from publication, book, journal
>>
>>>>>>>>>where publication.pubid = book.pubid or publication.pubid =
>>>>>>>>>journal.pubid
>>
>>>>>>>>>This example is taken from some lecture notes about sql. The notes
>>>>>>>>>said the solution 2 was wrong.
>>
>>>>>>>>>Anyone knows why ?
>>
>>>>>>>>Let me pose two more questions for you. What are the keys for each
>>>>>>>>table? What is the difference between ALL and DISTINCT?
>>
>>>>>>>>If you can answer those questions then you may be on the way to the
>>>>>>>>answer you are looking for.
>>
>>>>>>>>--
>>>>>>>>David Portas- Hide quoted text -
>>
>>>>>>>>- Show quoted text -
>>
>>>>>>>The keys for all the 3 tables are "pubid". "ALL" means to allow
>>>>>>>duplicated records, and "DISTINCT" will remove the duplicated ones.
>>
>>>>>>>But I still don't get it.
>>
>>>>>>>Could you give more hints ?
>>
>>>>>>In the second case, if a book id matches a publication id, how many
>>>>>>journals will match the WHERE clause? Remember TRUE OR FALSE = ____
>>>>>>Similarly if a journal id matches the a publication id, how many books
>>>>>>will match the WHERE clause? Remember FALSE OR TRUE = ____
>>
>>>>>>I will let you fill in the blanks.
>>
>>>>>>Consider as well the nullary cases. What happens to the result of the
>>
>>>>>>FROM clause if there are no books at all or if there are no journals at all?- Hide quoted text -
>>
>>>>>>- Show quoted text -
>>
>>>>>In the second case, if a book id matches a publication id, how many
>>>>>journals will match the WHERE clause?
>>
>>>>>0
>>
>>>>>Similarly if a journal id matches the a publication id, how many books
>>>>>will match the WHERE clause?
>>
>>>>>0
>>
>>>>>>Consider as well the nullary cases.
>>
>>>>>This is the case which fails the solution 2, I think.
>>
>>>>It is one case and one mechanism by which solution 2 fails. Go back
>>>>where I left the blanks and fill them in. Then re-read the paragraph
>>>>they appear in.- Hide quoted text -
>>
>>>>- Show quoted text -
>>
>>>My answer for your previous question.
>>
>>>>In the second case, if a book id matches a publication id, how many
>>
>>>journals will match the WHERE clause?
>>
>>>If a book id matches a publication id, 0 journal will match the WHERE
>>>clause.
>>
>>I disagree. No journals will match "publication.pubid = journal.pubid",
>>but that is only part of the WHERE clause. More than zero journals will
>>match the where clause. How many?

All of them.

>>TRUE OR X = ____
>>
>>X OR TRUE = ____

>
> both are TRUE.

Regardless of X. In your where clause, X is the restrict condition for books in one case and for journals in the other case.

>>>but the where clause : "where publication.pubid = book.pubid or
>>>publication.pubid = journal.pubid"
>>>will be TRUE.
>>
>>Yes, it will be true. It will be true for how many journals?

> 
> 
> It will be true for |book.pubid|.

You have that backward. It be true for |journal.pubid|. All journals will match for every book and vice versa.

>>>Similiar answer to the other part of your question.
>>
>>>Please correct me if I am wrong.
>>
>>You are focusing on single records rather than the CROSS PRODUCT, but
>>the FROM clause yields a CROSS PRODUCT.
>>
>>If you have 5 publications P, 3 books B and 2 journals J:
>>
>>P has 5 rows
>>B has 3 rows
>>J has 2 rows
>>P X B has 15 rows
>>P X J has 10 rows
>>B X J has 6 rows
>>P X B X J has 30 rows
>>
>>Let's take P X B and restrict it to rows with equal pubid (ie. replace
>>CROSS PRODUCT with NATURAL JOIN) and call the result PB.
>>
>>PB has 3 rows
>>PB X J has ____ rows ?- Hide quoted text -
>
> Since |PB| = 3 and |J|=2, |PB * J| = 6.

Exactly, which means every book appears twice. Let's take P X J and restrict it to rows with equal pubid (ie. replace CROSS PRODUCT with NATURAL JOIN) and call the result PJ.

PJ has 2 rows
PJ X B has 6 rows

The result you want has |PB| + |PJ| - |INTERSECT(B,J)| rows.

Your second solution will have
|PB X J| + |PJ X B| - |INTERSECT(PB X J,PJ X B)| rows.

Each book will be duplicated |J| times and each journal |B| times.

As you mentioned elsewhere, you can get rid of the duplicates using DISTINCT, but this only works when |B| > 0, |J| > 0 and the select list contains only columns from P. Those requirements severely limit the opportunity for generalizing the method. Received on Tue Jul 10 2007 - 22:18:04 CEST

Original text of this message