The Way the Query Works
For every cap row (x), we test the following. For every
cap row (y), we find out if the cap (z) exists in all rows for every
language. From the innermost query, if there is a result (a "match"), some
row is SELECTed and then the innermost query is TRUE. Every time the
innermost query is TRUE, the innermost query tells the middle query FALSE
(because it contains "WHERE NOT EXISTS"). If the innermost query fails
(i.e., it is FALSE), then the middle query sees TRUE, which is passed back
to the outermost query as FALSE. To succeed (i.e., to SELECT a "language"),
the query has to have this configuration:
-
SELECT a row in cap (x) (outer query).
-
For that row, begin SELECTing each row again in
cap (y) (middle query).
-
For each of the middle query rows, you want the
inner query (cap z) to be TRUE for all cases of the middle query —
remember that TRUE is translated to FALSE by the NOT EXISTS. As each
inner query is satisfied (it is TRUE), it forces the middle query to
continue looking for a match — to look at all cases and eventually
conclude FALSE (evaluate to FALSE overall). If the middle query is
FALSE, the outer query sees TRUE because of its NOT EXISTS.
To make the middle query (y) find FALSE, all of the inner
query (z) occurrences must be TRUE (i.e., the languages from the outer query
must exist with all names from the middle one (y) in the inner one (z)). For
an eventual "match," every row in the middle query for an outer query row
must be FALSE (i.e., every row in the inner query is TRUE).
Example for the query (What language is spoken by all
students?):
SELECT name, langu
FROM cap x
WHERE NOT EXISTS
(SELECT 'X'
FROM
cap y
WHERE
NOT EXISTS
(SELECT 'X'
FROM cap z
WHERE x.langu = z. langu
AND y.name = z.name))
ORDER BY langu;
Suppose we had this table:
NAME
LANGUAGE
Joe
Spanish
Mary Spanish
Mary French
-
The tuple <Joe, Spanish> is SELECTed by the outer
query (x).
-
The tuple <Joe, Spanish> is SELECTed by the middle
query (y).
-
The tuple <Joe, Spanish> is SELECTed by the inner
query (z).
-
The inner query is TRUE:
X.LANGUAGE = Spanish
Z.LANGUAGE = Spanish
Y.NAME = Joe
Z.NAME = Joe
-
Because the inner query is TRUE, the NOT EXISTS of
the middle query translates this to FALSE and continues with the next
row in the middle query. The middle query SELECTs <Mary, Spanish> and
the inner query begins again with <Joe, Spanish> seeing:
X.LANGUAGE = Spanish
Z.LANGUAGE = Spanish
Y.NAME = Mary
Z.NAME = Joe
This is FALSE, so the
inner query SELECTs a second row <Mary, Spanish>:
X.LANGUAGE = Spanish
Z.LANGUAGE = Spanish
Y.NAME = Mary
Z.NAME = Mary
This is TRUE, so the
inner query is TRUE. (Notice that the X.LANGUAGE has not changed yet; the
outer query (X) is still on the first row.)
-
Because the inner query is TRUE, the NOT EXISTS of
the middle query translates this to FALSE and continues with the next
row in the middle query. The middle query now SELECTs <Mary, French> and
the inner query begins again with <Joe, Spanish> seeing:
X.LANGUAGE = Spanish
Z.LANGUAGE = Spanish
Y.NAME = Mary
Z.NAME = Joe
This is FALSE, so the
inner query SELECTs a second row <Mary, Spanish>:
X.LANGUAGE = Spanish
Z.LANGUAGE = Spanish
Y.NAME = Mary
Z.NAME = Mary
This is TRUE, so the
inner query is TRUE.
-
Because the inner query is TRUE, the NOT EXISTS of
the middle query again converts this TRUE to FALSE and wants to
continue, but the middle query is out of tuples. This means that the
middle query is FALSE.
-
Because the middle query is FALSE, and because we
are testing
"SELECT distinct
name, language
FROM cap x
WHERE NOT EXISTS
(SELECT 'X' FROM cap y ...",
the FALSE from the
middle query is translated to TRUE for the outer query and the tuple <Joe,Spanish>
is SELECTed for the result set. Note that "Spanish" occurs with both "Joe"
and "Mary."
-
The second row in the outer query will repeat the
steps from above for <Mary, Spanish>. The value "Spanish" will be seen
to occur with both "Joe" and "Mary" as <Mary, Spanish> is added to the
result set.
-
The third tuple in the outer query begins with
<Mary, French>. The middle query SELECTs <Joe, Spanish> and the inner
query SELECTs <Joe, Spanish>. Inner query sees:
X.LANGUAGE = French
Z.LANGUAGE = Spanish
Y.NAME = Joe
Z.NAME = Mary
This is FALSE, so the
inner query SELECTs a second row, <Mary, Spanish>:
X.LANGUAGE = French
Z.LANGUAGE = Spanish
Y.NAME = Joe
Z.NAME = Mary
This is FALSE, so the
inner query SELECTs a third row, <Mary, French>:
X.LANGUAGE = French
Z.LANGUAGE = French
Y.NAME = Joe
Z.NAME = Mary
This is also FALSE.
The inner query fails. The inner query evaluates to FALSE, which causes the
middle query to see TRUE because of the NOT EXISTS. Because the middle query
sees TRUE, it is finished, evaluated to TRUE. And because the middle query
evaluates to TRUE, the NOT EXISTS in the outer query changes this to FALSE
and "X.LANGUAGE = French" fails. It fails because X.LANGUAGE = French did
not occur with all values of NAME.
The tip-off of what a query of this kind means can be
found in the inner query where the outer query is tested. In the above, you
will find a phrase that says, WHERE x.langu = z. langu … . The x.langu is
where the query is testing for all names to occur with a given language. The
query is asking, "What language is spoken by all students?" Put another way,
"What language occurs for all names?"
Note that the above query is completely different from
the following one, which asks, "Which students speak all languages?":
SELECT distinct name, langu
FROM cap x
WHERE NOT EXISTS
(SELECT 'X'
FROM
cap y
WHERE
NOT EXISTS
(SELECT 'X'
FROM cap z
WHERE y.langu = z.langu
AND
x.name = z.name))
ORDER BY langu;
which produces the following response:
NAME
LANGU
---------- ----------
MARY
FRENCH
MARY
SPANISH
2 Rows Displayed ***
Here, note that the inner query contains x.name, which
means the question was "Which names occur for all languages?"; or put
another way, "Which students speak all languages?". The "all" goes with
languages. In general, then, if a table T has attributes (A, B) and you ask
the question, "Is there a value of B that occurs for all A?", the query will
be:
SELECT DISTINCT A,B
FROM T X
WHERE NOT EXISTS
(SELECT 'X'
FROM
T Y
WHERE
NOT EXISTS
(SELECT 'X'
FROM T Z
WHERE
X.B = Z.B
AND Y.A = Z.A) ;
[ORDER BY is optional]