15.10

Consider the following extended relational-algebra operators. Describe how to implement each operation using sorting and using hashing.

  1. Semijoin (\(\ltimes_\theta\)): The multiset semijoin operator \(r \ltimes_\theta s\) is defined as follows: if a tuple \(r_i\) appears \(n\) times in \(r\), it appears \(n\) times in the result of \(r \ltimes_\theta s\) if there is at least one tuple \(s_j\) such that \(r_i\) and \(s_j\) satisfy predicate \(\theta\); otherwise \(r_i\) does not appear in the result.

  2. Anti-semijoin (\(\overline{\ltimes}_\theta\)): The multiset anti-semijoin operator \(r \overline{\ltimes}_\theta s\) is defined as follows: if a tuple \(r_i\) appears \(n\) times in \(r\), it appears \(n\) times in the result of \(r \overline{\ltimes}_\theta s\) if there does not exist any tuple \(s_j\) in \(s\) such that \(r_i\) and \(s_j\) satisfy predicate \(\theta\); otherwise \(r_i\) does not appear in the result.


As in the case of join algorithms, semijoin and anti-semijoin can be done efficiently if the join conditions are equijoin conditions. We describe below how to efficiently handle the case of equijoin conditions using sorting and hashing. With arbitrary join conditions, sorting and hashing cannot be used; (block) nested loop join needs to be used instead.

  1. Semijoin:

  1. Anti-semijoin: