Page 189 - DCAP605_ADVANCED_DATA_STRUCTURE_AND_ALGORITHMS
P. 189
Advanced Data Structure and Algorithms
Notes 2. Concurrent Access to B-trees: Databases typically run in multiuser environments where
many users can concurrently perform operations on the database.
Unfortunately, this common scenario introduces complications.
Example: Imagine a database storing bank account balances. Now assume that
someone attempts to withdraw `40 from an account containing `60. First, the current
balance is checked to ensure sufficient funds. After funds are disbursed, the balance of
the account is reduced. This approach works flawlessly until concurrent transactions are
considered. Suppose that another person simultaneously attempts to withdraw `30 from
the same account. At the same time the account balance is checked by the first person, the
account balance is also retrieved for the second person. Since neither person is requesting
more funds than are currently available, both requests are satisfied for a total of `70. After
the first person’s transaction, `20 should remain (`60 - `40), so the new balance is recorded
as `20. Next, the account balance after the second person’s transaction, `30 (`60 – `30), is
recorded overwriting the `20 balance. Unfortunately, `70 have been disbursed, but the
account balance has only been decreased by `30. Clearly, this behavior is undesirable, and
special precautions must be taken.
A b-tree suffers from similar problems in a multiuser environment. If two or more processes
are manipulating the same tree, it is possible for the tree to become corrupt and result in
data loss or errors.
The simplest solution is to serialize access to the data structure. In other words, if another process
is using the tree, all other processes must wait.
Although this is feasible in many cases, it can place a unnecessary and costly limit on
performance because many operations actually can be performed concurrently without risk.
Locking, introduced by Gray and refined by many others, provides a mechanism for controlling
concurrent operations on data structures in order to prevent undesirable side effects and to
ensure consistency.
8.9 Summary
B-trees are balanced trees that are optimized for situations when part or the entire tree
must be maintained in secondary storage such as a magnetic disk.
A B-tree is a specialized multiway tree designed especially for use on disk. In a B-tree each
node may contain a large number of keys. The number of subtrees of each node, then, may
also be large.
A B-tree is designed to branch out in this large number of directions and to contain a lot of
keys in each node so that the height of the tree is relatively small.
This means that only a small number of nodes must be read from disk to retrieve an item.
The goal is to get fast access to the data, and with disk drives this means reading a very
small number of records. Note that a large node size (with lots of keys in the node) also fi ts
with the fact that with a disk drive one can usually read a fair amount of data at once.
8.10 Keywords
B-Tree Algorithms: A B-tree is a data structure that maintains an ordered set of data and allows
efficient operations to find, delete, insert, and browse the data.
184 LOVELY PROFESSIONAL UNIVERSITY