Page 176 - DCAP605_ADVANCED_DATA_STRUCTURE_AND_ALGORITHMS
P. 176

Unit 8: B-trees




          search trees, but may waste some space, since nodes are not entirely full. The lower and upper   Notes

          bounds on the number of child nodes are typically fixed for a particular implementation.
          A B-tree is kept balanced by requiring that all leaf nodes are at the same depth. This depth will
          increase slowly as elements are added to the tree, but an increase in the overall depth is infrequent,
          and results in all leaf nodes being one more hop further removed from the root.
                               Figure 8.1: Schema showing the B-tree Structure
                                                 r
                                                     k
                                            C        1
                                                               C
                                             1
                                                                2
                      r.c                                 r.c
                        1                                   2
                          k     k    k                         k     k
                           1     2    3                         1     2
                 C       C          C       C 4        C 1      C           C 3
                   1      2          3                           2


              k   k    k   k    k   k   k    k  k    k   k   k   k   k   k   k   k
               1   2    1   2    1   2   1    2  3    1   2   1   2   3  4    1   2

          B-trees are balanced trees that are optimized for situations when part or the entire tree must be
          maintained in secondary storage such as a magnetic disk. Since disk accesses are expensive (time
          consuming) operations, a b-tree tries to minimize the number of disk accesses.

                Example: A b-tree with a height of 2 and a branching factor of 1001 can store over one
          billion keys but requires at most two disk accesses to search for any node.

          8.2 Structure of B-trees

          Unlike a binary-tree, each node of a b-tree may have a variable number of keys and children.
          The keys are stored in non-decreasing order. Each key has an associated child that is the root of a
          subtree containing all nodes with keys less than or equal to the key but greater than the preceding
          key. A node also has an additional rightmost child that is the root for a subtree containing all keys
          greater than any keys in the node.
          A b-tree has a minimum number of allowable children for each node known as the minimization
          factor. If t is this minimization factor, every node must have at least t – 1 keys. Under certain
          circumstances, the root node is allowed to violate this property by having fewer than t – 1 keys.
          Every node may have at most 2t – 1 keys or, equivalently, 2t children.
          Since each node tends to have a large branching factor (a large number of children), it is typically
          necessary to traverse relatively few nodes before locating the desired key. If access to each
          node requires a disk access, then a b-tree will minimize the number of disk accesses required.
          The minimization factor is usually chosen so that the total size of each node corresponds to a

          multiple of the block size of the underlying storage device. This choice simplifies and optimizes
          disk access. Consequently, a b-tree is an ideal data structure for situations where all data cannot
          reside in primary storage and accesses to secondary storage are comparatively expensive (or time
          consuming).








                                           LOVELY PROFESSIONAL UNIVERSITY                                   171
   171   172   173   174   175   176   177   178   179   180   181