14.22

Suppose a relation is stored in a B+-tree file organization. Suppose secondary indices store record identifiers that are pointers to records on disk.

  1. What would be the effect on the secondary indices if a node split happened in the file organization?

  2. What would be the cost of updating all affected records in a secondary index?

  3. How does using the search key of the file organization as a logical record identifier solve this problem?

  4. What is the extra cost due to the use of such logical record identifiers?


Suppose a relation is stored in a B+-tree file organization. Suppose secondary indices store record identifiers that are pointers to records on disk.

  1. What would be the effect on the secondary indices if a node split happened in the file organization?

If a node split happens on the file organization, there needs to be a corresponding change on the secondary index. The pointers to the records, where the records are the ones that are stored inside of the nodes that splitted, need to be updated.

  1. What would be the cost of updating all affected records in a secondary index?

It depends on the type of secondary index that we are using. Suppose our secondary index is a B+-tree, let \(h\) be height of our tree and \(n\) be the number of affected records. Then the cost of updating all affected records in this secondary index would be \(O(hn)\).

  1. How does using the search key of the file organization as a logical record identifier solve this problem?

If we use the search key of the file organization as a logical record identifier, then we don’t have make changes to the secondary-index when a node splits inside of the file organization.

There is a famous quote:-

“Every problem in Computer Science can be solved by another level of indirection. 😜”

  1. What is the extra cost due to the use of such logical record identifiers?

Let \(h_1\) be height of the secondary index (assuming our secondary index is a B+-tree).

Let \(h_2\) be height of the file organization.

When we perform a lookup using the secondary index: 1. If the secondary index uses record identifiers that are pointers to records on disk: \(O(h_1)\). 2. If the secondary index uses logical record indentifiers (search keys of the file organization): \(O(h_1 + h_2)\).