Page 253 - DCAP605_ADVANCED_DATA_STRUCTURE_AND

Page 253 - DCAP605_ADVANCED_DATA_STRUCTURE_AND_ALGORITHMS

P. 253

Advanced Data Structure and Algorithms

Notes }

}
void msort(int x[], int n)
{
int 1;
int y[];
1 = 1;
while(1 < n)
{
mpass(x, y, l, n);
1 = 1 * 2;
mpass(y, x, l, n);
1 = 1 * 2;
}
}

The merging of two sub-lists, the ﬁrst running from the index 1 to m, and the second running
from the index (m+1) to n requires no more than (n-1+1) iterations. Hence if 1 = 1, then no more
than n iterations, where n is the size of the list to be sorted. Therefore if n is the size of the list to be
sorted, every pass that a merge routine performs requires a time proportional to O(n), and since
the number of passes required to be performed are log n. The time complexity of the algorithm is
2
O(nlog (n)), both average and worst case. The merge sort requires an additional list of size n.
2
Multi-way Merge Sort

The two-phase, multiway merge-sort algorithm is similar to the external-memory merge-sort
algorithm presented in the previous section. Phase 1 is the same, but, in phase 2, the main loop is
performed only once merging all [N/M] runs into one run in one go. To achieve this, multiway
merging is performed instead of using the TWOWAY-MERGE algorithm.

The idea of multiway merging is the same as for the two-way merging, but instead of having 2
input buffers (Bf and Bf ) of B elements, we have [N/M] input buffers, each B elements long.
1
2
Each buffer corresponds to one unﬁnished (or active) run. Initially, all runs are active. Each buffer

has a pointer to the ﬁrst unchosen element in that buffer (analogous to p and p in TWOWAY-

2
1
MERGE).
The multiway merging is performed by repeating these steps:
1. Find the smallest element among the unchosen elements of all the input buffers. Linear
search is sufﬁcient, but if the CPU cost is also important, minimum priority queue can be

used to store pointers to all the unchosen elements in input buffers. In such a case, ﬁ nding
the smallest element is logarithmic in the number of the active runs.
2. Move the smallest element to the ﬁrst available position of the output buffer.

3. If the output buffer is full, write it to the disk and reinitialize the buffer to hold the next
output page.
4. If the buffer, from which the smallest element was just taken is now exhausted of elements,
read the next page from the corresponding run. If no pages remain in that run, consider the

run ﬁnished (no longer active).

248 LOVELY PROFESSIONAL UNIVERSITY

248 249 250 251 252 253 254 255 256 257 258