Page 207 - DCAP402_DCAO204_DATABASE MANAGEMENT SYSTEM_MANAGING DATABASE
P. 207
Database Management Systems/Managing Database
Notes Linear search can be applied regardless of selection condition or ordering of records in the
file, or availability of indices.
2. Binary search: It is applicable when the selection is an equality comparison on the attribute
on which file is ordered. Assume that the blocks of a relation are stored continuously then,
cost can be estimated as:
Cost = Cost of locating the first tuple by a binary search on the blocks + Sequence of other
blocks that continue to satisfy the condition.
Average number of tuples with the same value
= [log (N )] +
2 b Blocking factor (Number of tuples in a block) of the relation
These two values may be calculated from the statistics of the database.
12.2.2 Index Scan
Search algorithms that use an index are restricted because the selection condition must be on the
search-key of the index.
1. (a) Primary index-scan for equality: This search retrieves a single record that satisfies
the corresponding equality condition. The cost here can be calculated as:
Cost = Height traversed in index to locate the block pointer +1(block of the primary
key is transferred for access).
(b) Hash key: It retrieves a single record in a direct way thus, cost in hash key may also
be considered as Block transfer needed for finding hash target +1
2. Primary index-scan for comparison: Assuming that the relation is sorted on the attribute(s)
that are being compared, (< , > etc.), then we need to locate the first record satisfying the
condition after which the records are scanned forward or backward as the condition may
be, displaying all the records. Thus cost in this case would be:
Cost = Number of block transfer to locate the value in index + Transferring all the blocks
of data satisfying that condition.
Notes We can calculate roughly (from the cost point of view) the number of blocks satisfying
the condition as:
Number of values that satisfy the condition × Average number of tuples per attribute
value/Blocking factor of the relation.
3. Equality on clustering index to retrieve multiple records: The cost calculations in this case
are somewhat similar to that of algorithm (primary index-scan for comparison).
(a) Equality on search-key of secondary index: Retrieves a single record if the search-key is
a candidate key.
Cost = Cost of accessing index + 1.
It retrieves multiple records if search-key is not a candidate key.
Cost = Cost of accessing index + Number of records retrieved (It can be very
expensive).
Each record may be on a different block, thus, requiring one block access for each
retrieved record (this is the worst case cost).
200 LOVELY PROFESSIONAL UNIVERSITY