2.9

The division operator of relational algebra, “\(\div\)”, is defined as follows. Let \(r(R)\) and \(s(S)\) be relations, and let \(S \subseteq R\); that is, every attribute of schema \(S\) is also in schema \(R\). Given a tuple \(t\), let \(t[S]\) denote the projection of tuple \(t\) on the attributes in \(S\). Then \(r \div s\) is a relation on schema \(R - S\) (that is, on the schema containing all attributes of schema \(R\) that are not in schema \(S\)). A tuple \(t\) is in \(r \div s\) if and only if both of the following conditions hold:

  * \(t\) is in \(\Pi_{R-S}(r)\)
  * For every tuple \(t_s\) in \(s\), there is a tuple \(t_r\) in \(r\) satisfying both of the following:
    a. \(t_r[S] = t_s[S]\)
    b. \(t_r[R - S] = t\)

Given the above definition:

  1. Write a relational algebra expression using the division operator to find the IDs of all students who have taken all Comp. Sci. courses. (Hint: project takes to just ID and course_id, and generate the set of all Comp. Sci. course_id values using a select expression, before doing the division.)
  2. Show how to write the above query in relational algebra, without using division. (By doing so, you would have shown how to define the division operation using the other relational algebra operations.)

  1. \(\Pi_{ID,course\_id}(takes) \div \Pi_{course\_id}(\sigma_{dept\_name = 'Comp. Sci.'}(course))\)
  2. The required expression is as follows:

\[ \begin{aligned} r \leftarrow \Pi_{ID,course\_id}(takes) \\ s \leftarrow \Pi_{course\_id}(\sigma_{dept\_name = 'Comp. Sci.'}(course)) \\ \Pi_{ID}(takes) - \Pi_{ID}((\Pi_{ID}(takes) \times s) - r) \end{aligned} \]
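The three steps of this expression can be traced on a small instance. The following is a minimal sketch in Python, using sets of tuples to stand in for relations; the student IDs and course rows are made-up sample data, not taken from any particular database instance.

```python
from itertools import product

# Hypothetical sample data (assumption): (ID, course_id) pairs from takes,
# and (course_id, dept_name) pairs from course.
takes = {
    ("00128", "CS-101"), ("00128", "CS-347"),
    ("12345", "CS-101"), ("12345", "BIO-101"),
    ("98765", "CS-101"), ("98765", "CS-347"), ("98765", "BIO-101"),
}
course = {
    ("CS-101", "Comp. Sci."),
    ("CS-347", "Comp. Sci."),
    ("BIO-101", "Biology"),
}

# r <- projection of takes on (ID, course_id); the sample is already in that form.
r = set(takes)

# s <- course_id of every Comp. Sci. course (selection followed by projection).
s = {course_id for course_id, dept_name in course if dept_name == "Comp. Sci."}

# Projection of takes on ID.
ids = {student_id for student_id, _ in r}

# (ids x s) - r: (ID, course_id) pairs where the student has NOT taken the course.
missing = set(product(ids, s)) - r

# Remove every student who is missing at least one Comp. Sci. course.
result = ids - {student_id for student_id, _ in missing}

print(sorted(result))  # ['00128', '98765']
```

Student 12345 is removed because the pair (12345, CS-347) appears in the Cartesian product but not in \(r\).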

In general, let \(r(R)\) and \(s(S)\) be given, with \(S \subseteq R\). Then we can express the division operation using basic relational algebra operations as follows: \[ r \div s = \Pi_{R-S}(r) - \Pi_{R-S}((\Pi_{R-S}(r) \times s) - \Pi_{R-S, S}(r)) \]

To see that this equivalence holds, we observe that \(\Pi_{R-S}(r)\) gives us all tuples \(t\) that satisfy the first condition of the definition of division. The expression on the right side of the set difference operator, \(\Pi_{R-S}((\Pi_{R-S}(r) \times s) - \Pi_{R-S, S}(r))\), serves to eliminate those tuples that fail to satisfy the second condition of the definition of division. Let us see how it does so. Consider \(\Pi_{R-S}(r) \times s\). This relation is on schema \(R\), and pairs every tuple in \(\Pi_{R-S}(r)\) with every tuple in \(s\). The expression \(\Pi_{R-S,S}(r)\) merely reorders the attributes of \(r\).

Thus, \((\Pi_{R-S}(r) \times s) - \Pi_{R-S, S}(r)\) gives us those pairs of tuples from \(\Pi_{R-S}(r)\) and \(s\) that do not appear in \(r\). If a tuple \(t_j\) is in \(\Pi_{R-S}((\Pi_{R-S}(r) \times s) - \Pi_{R-S, S}(r))\), then there is some tuple \(t_s\) in \(s\) that does not combine with tuple \(t_j\) to form a tuple in \(r\). Thus, \(t_j\) holds a value for attributes \(R - S\) that does not appear in \(r \div s\). It is these values that we eliminate from \(\Pi_{R-S}(r)\).
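The argument above can also be checked mechanically on small inputs. The sketch below is one possible Python rendering, assuming relations are modeled as sets of tuples paired with a tuple of attribute names; the function names `divide_by_definition` and `divide_by_difference` and the sample data are illustrative choices, not part of the exercise.

```python
from itertools import product


def _project(t, idx):
    """Project tuple t onto the positions listed in idx."""
    return tuple(t[i] for i in idx)


def _split_schema(r_schema, s_schema):
    """Return the positions in r_schema of the R - S and S attributes."""
    rest_idx = [i for i, a in enumerate(r_schema) if a not in s_schema]
    s_idx = [r_schema.index(a) for a in s_schema]
    return rest_idx, s_idx


def divide_by_definition(r, r_schema, s, s_schema):
    """r / s computed directly from the two conditions of the definition."""
    rest_idx, s_idx = _split_schema(r_schema, s_schema)
    # Condition 1: t must be in the projection of r on R - S.
    candidates = {_project(t, rest_idx) for t in r}
    # Each tuple of r split into its (R - S part, S part).
    parts = {(_project(t, rest_idx), _project(t, s_idx)) for t in r}
    # Condition 2: every t_s in s has a matching t_r in r.
    return {t for t in candidates if all((t, ts) in parts for ts in s)}


def divide_by_difference(r, r_schema, s, s_schema):
    """r / s computed with projection, product, and set difference only."""
    rest_idx, s_idx = _split_schema(r_schema, s_schema)
    candidates = {_project(t, rest_idx) for t in r}          # Pi_{R-S}(r)
    reordered = {_project(t, rest_idx + s_idx) for t in r}   # Pi_{R-S,S}(r)
    all_pairs = {t + ts for t, ts in product(candidates, s)} # Pi_{R-S}(r) x s
    k = len(rest_idx)
    # Pi_{R-S} of the pairs that are absent from r.
    disqualified = {pair[:k] for pair in all_pairs - reordered}
    return candidates - disqualified


# Hypothetical sample data; the S attribute is listed first in R on purpose,
# so that the reordering step Pi_{R-S,S}(r) actually does something.
r_schema = ("course_id", "ID")
r = {("CS-101", "1"), ("CS-190", "1"), ("CS-101", "2"), ("BIO-101", "2")}
s_schema = ("course_id",)
s = {("CS-101",), ("CS-190",)}

by_definition = divide_by_definition(r, r_schema, s, s_schema)
by_difference = divide_by_difference(r, r_schema, s, s_schema)
print(by_definition, by_difference)  # {('1',)} {('1',)}
```

Both functions also agree when \(s\) is empty: the second condition then holds vacuously, the Cartesian product is empty, and the result is all of \(\Pi_{R-S}(r)\).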