Page 207 - DCAP402_DCAO204_DATABASE MANAGEMENT SYSTEM_MANAGING DATABASE
P. 207

Database Management Systems/Managing Database




                    Notes              Linear search can be applied regardless of selection condition or ordering of records in the
                                       file, or availability of indices.
                                   2.  Binary search: It is applicable when the selection is an equality comparison on the attribute
                                       on which file is ordered. Assume that the blocks of a relation are stored continuously then,
                                       cost can be estimated as:

                                       Cost = Cost of locating the first tuple by a binary search on the blocks + Sequence of other
                                       blocks that continue to satisfy the condition.

                                                        Average number of tuples with the same value
                                       = [log (N )] +
                                            2   b  Blocking factor (Number of tuples in a block) of the relation
                                       These two values may be calculated from the statistics of the database.

                                   12.2.2 Index Scan


                                   Search algorithms that use an index are restricted because the selection condition must be on the
                                   search-key of the index.
                                   1.  (a)  Primary index-scan for equality: This search retrieves a single record that satisfies
                                            the corresponding equality condition. The cost here can be calculated as:
                                            Cost = Height traversed in index to locate the block pointer +1(block of the primary
                                            key is transferred for access).

                                       (b)  Hash key: It retrieves a single record in a direct way thus, cost in hash key may also
                                            be considered as Block transfer needed for finding hash target +1
                                   2.  Primary index-scan for comparison: Assuming that the relation is sorted on the attribute(s)
                                       that are being compared, (< , > etc.), then we need to locate the first record satisfying the
                                       condition after which the records are scanned forward or backward as the condition may
                                       be, displaying all the records. Thus cost in this case would be:
                                       Cost = Number of block transfer to locate the value in index + Transferring all the blocks
                                       of data satisfying that condition.




                                     Notes  We can calculate roughly (from the cost point of view) the number of blocks satisfying
                                     the condition as:
                                     Number of values that satisfy the condition × Average number of  tuples per attribute
                                     value/Blocking factor of the relation.
                                   3.  Equality on clustering index to retrieve multiple records: The cost calculations in this case
                                       are somewhat similar to that of algorithm (primary index-scan for comparison).
                                       (a)  Equality on search-key of secondary index: Retrieves a single record if the search-key is
                                            a candidate key.
                                            Cost = Cost of accessing index + 1.
                                            It retrieves multiple records if search-key is not a candidate key.
                                            Cost =  Cost  of accessing index  +  Number  of  records retrieved  (It can be  very
                                            expensive).
                                            Each record may be on a different block, thus, requiring one block access for each
                                            retrieved record (this is the worst case cost).


          200                               LOVELY PROFESSIONAL UNIVERSITY
   202   203   204   205   206   207   208   209   210   211   212