Page 265 - DCAP605_ADVANCED_DATA_STRUCTURE_AND_ALGORITHMS
P. 265

Advanced Data Structure and Algorithms




                    Notes          Therefore, the probability follows the binomial distribution, which has

                                   Mean:       E[n ] = np = 1
                                             i
                                   Variance:  Var[n ] = np(1 – p) = 1 – 1/n
                                                i
                                   For any random variable, we have
                                                           2
                                            E[ n ] =  Var[n ] +  E [n ]
                                             2
                                               i       i     i
                                                =  1 – 1/n + 1 2
                                                =  2 – 1n
                                                = Θ(1)
                                   Putting this value in equation A above, (do some tweaking) and we have a expected time for
                                   INSERTION_SORT, O(n).

                                   Now back to our original problem
                                   In the above Bucket sort algorithm, we observe
                                   T(n) = [Time to insert n elements in array A] + [Time to go through auxiliary array B[0 . . n-1] *
                                   (Sort by INSERTION_SORT)
                                   = O(n) + (n-1)    (n)
                                   = O(n)

                                   Therefore, the entire Bucket sort algorithm runs in linear expected time.



                                       Task    Discuss bubble sort with suitable example.


                                   12.9 External Sorting


                                   External sorting refers to the sorting of a file that is on disk (or tape). Internal sorting refers to the

                                   sorting of an array of data that is in RAM. The main concern with external sorting is to minimize
                                   disk access since reading a disk block takes about a million times longer than accessing an item
                                   in RAM (according to Shaffer – see the reference at the end of this document).
                                   Perhaps the simplest form of external sorting is to use a fast internal sort with good locality of
                                   reference (which means that it tends to reference nearby items, not widely scattered items) and
                                   hope that your operating system’s virtual memory can handle it. (Quicksort is one sort algorithm

                                   that is generally very fast and has good locality of reference.) If the file is too huge, however, even
                                   virtual memory might be unable to fit it. Also, the performance may not be too great due to the

                                   large amount of time it takes to access data on disk.
                                   Methods

                                   Most external sort routines are based on mergesort. They typically break a large data fi le into
                                   a number of shorter, sorted “runs”. These can be produced by repeatedly reading a section of

                                   the data file into RAM, sorting it with ordinary quicksort, and writing the sorted data to disk.
                                   After the sorted runs have been generated, a merge algorithm is used to combine sorted fi les into

                                   longer sorted files. The simplest scheme is to use a 2-way merge: merge 2 sorted files into one


                                   sorted file, then merge 2 more, and so on until there is just one large sorted file. A better scheme

                                   is a multiway merge algorithm: it might merge perhaps 128 shorter runs together.



          260                              LOVELY PROFESSIONAL UNIVERSITY
   260   261   262   263   264   265   266   267   268   269   270