Page 244 - DCAP605_ADVANCED_DATA_STRUCTURE_AND_ALGORITHMS
P. 244
Unit 12: Sorting
second_list = x[1], x[6], x[11], x[16]…….x[96] Notes
third_list = x[2], x[7], x[12], x[17]…….x[97]
forth_list = x[3], x[8], x[13], x[18]…….x[98]
fifth_list = x[4], x[9], x[14], x[19] …….x[99]
th
So the i sub list will contain every K element of the original list starting from index i-1.
th
According to the algorithm mentioned above, for each iteration, the list is divided and then
sorted. If we use the same value of K, we will get the same sub lists and every time we will sort
the same sub lists, which will not result in the ordered final list. Note that sorting the fi ve sub
lists independently do not ensure that the full list is sorted! So we need to change the value of
K (increase or decrease?) for every iteration. To know whether the array is sorted, we need to
scan the full list. We also know that number of sub lists we get are equal to the value of K. So if
we decide to reduce the value of K after every iteration we will reduce the number of sub lists
also in every iteration. Eventually, when K will be set to 1, we will have only one sub list. Hence
we know the termination condition for our algorithm is K = 1. Since for every iteration we are
decreasing the value of the increment (K) the algorithm is also known as “diminishing increment
sort”.
Any sorting algorithm or a combination of algorithms can be used for sorting the sub lists, e.g.
some shell sort implementation use bubble sort for the last iteration and insertion sort for other
iterations. We use insertion sort in this tutorial. How does shell sort give better performance if
the sorting is ultimately done by algorithms like insertion sort and bubble sort? The rationale is
as follows.
If the list is either small or almost sorted then insertion sort is efficient since less number of
elements will be shifting. In Shell Sort as we have already seen, the original list is divided into
smaller lists based on the value of increment and sorted.
Initially value of K is fairly large giving a large number of small sub lists. We know that simple
sorting algorithms like insertion sort are effective for small lists, since the data movements are
over short distances. As K reduces in value, the length of the sub lists increases. But since the
earlier sub lists have been sorted, we except the full list to look more and more sorted. Again note
that algorithms such as insertion sort are fairly efficient when they work with nearly sorted lists.
Thus the inefficiency arising out of working with larger lists is partly compensated by lists being
increasingly sorted. This is the intuitive explanation for the performance of shell sort.
How to choose the value of increment K?
Till date there is no convincing answer to what should be the optimal sequence for the increment,
K. But it has been empirically found that larger number of increments gives more effi cient results.
Also it is advisable not to use empirical sequences like 1,2,4,8… or 1,3,6,9… if we do so, for
different iteration, we may get almost the same elements in the sub lists to compare and sort
which will not result in better performance. For example, consider a list X= 12 6 2 5 8 10 1 15 31
23 and increment sequences 6 and 3.
When the increment is 6, you will get the following sub lists
(12,1)
(6,15)
(2,31)
(5,23)
( 8)
(10)
LOVELY PROFESSIONAL UNIVERSITY 239