Page 247 - DCAP605_ADVANCED_DATA_STRUCTURE_AND_ALGORITHMS
P. 247

Advanced Data Structure and Algorithms




                    Notes          the h  form a geometric series, the sum of all h  with k = 1, ..., t is in O(ht) = O( n). Thus O(n⋅n)
                                       k                                k
                                   sorting steps are needed for this part where k t.
                                   Now let k > t. When the sequence is arranged as an array with h  columns there are n/h  elements
                                                                                                       k
                                                                                    k
                                   in each column. Thus, O((n/h )2) sorting steps are needed to sort each column, since Insertion
                                                           k
                                   Sort has quadratic complexity. There are h  columns, therefore the number of sorting steps for h -
                                                                                                              k
                                                                    k
                                   sorting the entire data sequence is in O((n/h )2⋅h ) = O(n⋅n/h ). Again, the n/h  form a geometric
                                                                     k
                                                                                                 k
                                                                         k
                                                                                   k
                                   series whose sum is in O(n/ht) = O( n). Therefore, again O(n⋅n) steps are needed for this part
                                   where k > t.
                                   It can be shown that for this h-sequence the upper bound is tight. But there is another h-sequence
                                   that leads to a more efficient behavior of Shellsort.

                                   Theorem:  With the h-sequence 1, 2, 3, 4, 6, 8, 9, 12, 16, ..., 2p3q, ... Shellsort needs O(n⋅log(n)2)
                                   steps for sorting a sequence of length n  (Pratt [Pra 79]).
                                   Proof:  If g = 2 and h = 3, then γ(g, h)  =  (g-1)⋅(h-1) – 1  =  1, i.e. in a 2,3-sorted sequence to the right

                                   of each element only the next element can be smaller. Therefore, O(n) sorting steps suffice to sort
                                   the sequence with Insertion Sort. Considering elements with odd and with even index separately,
                                   it becomes clear that again O(n) sorting steps suffice to make a 4,6-sorted sequence 2-sorted.

                                   Similarly, O(n) sorting steps suffice to make a 6,9-sorted sequence 3-sorted and so on.

                                   The above h-sequence has the property that for each h  also 2h  and 3h  occurs, so O(n) sorting
                                                                              k      k      k
                                   steps suffice for each h . Altogether there are log(n)2 elements in the h-sequence; thus the

                                                      k
                                   complexity of Shellsort with this h-sequence is in O(n⋅log(n)2).
                                   The h-sequence of Pratt performs best asymptotically, but it consists of log(n)2 elements.
                                   Particularly, if the data sequence is presorted, a h-sequence with less elements is better, since the
                                   data sequence has to be scanned (by the for-i-loop in the program) for each h , even if only a few
                                                                                               k
                                   sorting steps are performed.
                                   By combining the arguments of these two theorems h-sequences with O(log(n)) elements can be
                                   derived that lead to a very good performance in practice, as for instance the h-sequence of the
                                   program. But unfortunately, there seems to be no h-sequence that gives Shellsort a worst case
                                   performance of O(n⋅log(n)). It is an open question whether possibly the average complexity is in
                                   O(n⋅log(n)).
                                   12.4 Heap Sort


                                   Heapsort is a sorting technique which sorts a contiguous list of length n with O(nlog2(n)
                                   comparisons and movement of entries, even in the worst case. Hence it achieves worst-case
                                   bounds better than those of quicksort, and for contiguous list it is better than mergesort, since it
                                   needs only a small and constant amount of space apart from the list being sorted.
                                   Heapsort proceeds in two phases. First, all the entries in the list are arranged to satisfy heap
                                   property, and then top of the heap is removed and another entry is promoted to take its place
                                   repeatedly. Therefore we need a procedure which builds an initial heap to arrange all the entries
                                   in the list to satisfy heap property. The procedure which builds an initial heap uses a procedure
                                   which adjust the ith entry in the list whose entries at 2i and 2i + 1 positions already satisfy heap
                                   property in such a manner entry at ith position in the list will also satisfy heap property.
                                   The following C code implements the algorithm.
                                   void adjust(list x, int i, int n)
                                   {
                                      int j, k;
                                      Boolean flag;




          242                              LOVELY PROFESSIONAL UNIVERSITY
   242   243   244   245   246   247   248   249   250   251   252