Page 262 - DCAP201_FUNDAMENTALS_OF_DATA_STRUCTURES
P. 262

Unit 14: Searching




          The algorithm is easy to remember if you think about a child’s guessing game. Imagine a  Notes
          number between 1 and 1000 and guess the number. Each time you guess, you will get to know
          “higher” or “lower.” Of course you would begin by guessing 500, then either 250 or 750 depending
          on the response.
          You would continue to refine your choice until you got the number correct.

          14.2.1 Characteristics

          Every iteration eliminates half of the remaining possibilities. This makes binary searches very
          efficient – even for large collections. Our implementation relies on recursion, though it is
          equally as common to see an iterative approach.
          Binary search requires a sorted collection. This means the collection must either be sorted before
          searching, or inserts/updates must be smart. Also, binary searching can only be applied to a
          collection that allows random access (indexing).
          In the real world, binary searching is frequently used thanks to its performance characteristics
          over large collections. The only time binary searching doesn’t make sense is when the collection
          is being frequently updated (relative to searches), since resorting will be required.
          Hash tables can often provide better (though somewhat unreliable) performance. Typically, it’s
          relatively clear when data belongs in a hash table (for frequent lookups) versus when a search is
          needed.

          14.2.2 Analysis

                                        n
          A binary search on an array is O(log ) because at each test you can “throw out” one half of the
                                       2
          search space. If we assume n, the number of items to search, is a power of two (i.e. n = 2x) then,
          given that n is cut in half at each comparison, the most comparisons needed to find a single item
          in n is x.




             Notes  It is noteworthy that for very small arrays a linear search can prove faster than a
            binary search. However as the size of the array to be searched increases the binary search
            is the clear victor in terms of number of comparisons and therefore overall speed.

          Still, the binary search has some drawbacks. First of all, it requires that the data to be searched
          be in sorted order. If there is even one element out of order in the data being searched it can
          throw off the entire process.

               !
             Caution  When presented with a set of unsorted data the efficient programmer must decide
             whether to sort the data and apply a binary search or simply apply the less-efficient linear
             search.
          Even the best sorting algorithm is a complicated process. Is the cost of sorting the data is worth
          the increase in search speed gained with the binary search? If you are searching only once, it is
          probably better to do a linear search in most cases.
          Once the data is sorted it can also prove very expensive to add or delete items. In a sorted array,
          for instance, such operations require a ripple-shift of array elements to open or close a “hole” in
          the array.




                                           LOVELY PROFESSIONAL UNIVERSITY                                   255
   257   258   259   260   261   262   263   264   265   266   267