Page 253 - DCAP605_ADVANCED_DATA_STRUCTURE_AND_ALGORITHMS
P. 253
Advanced Data Structure and Algorithms
Notes }
}
void msort(int x[], int n)
{
int 1;
int y[];
1 = 1;
while(1 < n)
{
mpass(x, y, l, n);
1 = 1 * 2;
mpass(y, x, l, n);
1 = 1 * 2;
}
}
The merging of two sub-lists, the first running from the index 1 to m, and the second running
from the index (m+1) to n requires no more than (n-1+1) iterations. Hence if 1 = 1, then no more
than n iterations, where n is the size of the list to be sorted. Therefore if n is the size of the list to be
sorted, every pass that a merge routine performs requires a time proportional to O(n), and since
the number of passes required to be performed are log n. The time complexity of the algorithm is
2
O(nlog (n)), both average and worst case. The merge sort requires an additional list of size n.
2
Multi-way Merge Sort
The two-phase, multiway merge-sort algorithm is similar to the external-memory merge-sort
algorithm presented in the previous section. Phase 1 is the same, but, in phase 2, the main loop is
performed only once merging all [N/M] runs into one run in one go. To achieve this, multiway
merging is performed instead of using the TWOWAY-MERGE algorithm.
The idea of multiway merging is the same as for the two-way merging, but instead of having 2
input buffers (Bf and Bf ) of B elements, we have [N/M] input buffers, each B elements long.
1
2
Each buffer corresponds to one unfinished (or active) run. Initially, all runs are active. Each buffer
has a pointer to the first unchosen element in that buffer (analogous to p and p in TWOWAY-
2
1
MERGE).
The multiway merging is performed by repeating these steps:
1. Find the smallest element among the unchosen elements of all the input buffers. Linear
search is sufficient, but if the CPU cost is also important, minimum priority queue can be
used to store pointers to all the unchosen elements in input buffers. In such a case, fi nding
the smallest element is logarithmic in the number of the active runs.
2. Move the smallest element to the first available position of the output buffer.
3. If the output buffer is full, write it to the disk and reinitialize the buffer to hold the next
output page.
4. If the buffer, from which the smallest element was just taken is now exhausted of elements,
read the next page from the corresponding run. If no pages remain in that run, consider the
run finished (no longer active).
248 LOVELY PROFESSIONAL UNIVERSITY