Page 227 - DCAP402_DCAO204_DATABASE MANAGEMENT SYSTEM_MANAGING DATABASE

Database Management Systems/Managing Database




Architectures of Parallel Databases

                                   The basic idea behind parallel databases is to carry out evaluation steps in parallel whenever
                                   possible, in order to improve performance. There are many opportunities for parallelism in a
                                   DBMS; databases represent one of the most successful instances of parallel computing.
Figure 13.1: Physical Architectures for Parallel Database Systems
                                   Three main architectures have been proposed for building parallel DBMSs. In a shared-memory
                                   system, multiple CPUs are attached to an interconnection network and can access a common
                                   region of main memory. In a shared-disk system, each CPU has a private memory and direct
                                   access to all disks through an interconnection network.

In a shared-nothing system, each CPU has local main memory and disk space, but no two CPUs can access the same storage area; all communication between CPUs is through a network connection. The three architectures are illustrated in Figure 13.1.
The shared-memory architecture is closest to a conventional machine, and many commercial database systems have been ported to shared-memory platforms with relative ease. Communication overheads are low, because main memory can be used for this purpose, and operating system services can be leveraged to utilize the additional CPUs.

Although this approach is attractive for achieving moderate parallelism (a few tens of CPUs can be exploited in this fashion), memory contention becomes a bottleneck as the number of CPUs increases. The shared-disk architecture faces a similar problem, because large amounts of data are shipped through the interconnection network.
The basic problem with the shared-memory and shared-disk architectures is interference: as more CPUs are added, existing CPUs are slowed down because of the increased contention for memory accesses and network bandwidth. It has been noted that even an average 1 percent slowdown per additional CPU means that the maximum speedup is a factor of 37, and that adding further CPUs actually slows down the system; a system with 1,000 CPUs is only 4 percent as effective as a single-CPU system! This observation has motivated the development of the shared-nothing architecture, which is now widely considered to be the best architecture for large parallel database systems.
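The arithmetic behind these figures can be checked with a short sketch. Assuming a simple multiplicative interference model (each additional CPU slows every CPU by 1 percent, so the effective speedup of n CPUs is n × 0.99^(n−1) — an assumption for illustration, not the book's exact derivation), the curve peaks near a factor of 37 around 100 CPUs and falls below 5 percent of a single CPU at n = 1,000:

```python
# Interference model: each additional CPU slows all CPUs by 1 percent,
# so n CPUs deliver an effective speedup of n * 0.99**(n - 1).
# (A simple multiplicative model, assumed here for illustration.)

def effective_speedup(n: int, slowdown: float = 0.01) -> float:
    """Effective speedup of n CPUs when each additional CPU
    slows every CPU down by `slowdown` (default 1 percent)."""
    return n * (1.0 - slowdown) ** (n - 1)

if __name__ == "__main__":
    # Find where the curve peaks: around 100 CPUs, at roughly 37x.
    best = max(range(1, 2001), key=effective_speedup)
    print(f"peak speedup: {effective_speedup(best):.1f}x at {best} CPUs")
    # At 1,000 CPUs the system is only about 4 percent as effective
    # as a single-CPU system.
    print(f"1,000 CPUs: {effective_speedup(1000):.3f}x of a single CPU")
```

Under this model the peak speedup is about 37, matching the figure quoted above, and 1,000 CPUs yield an effective speedup of about 0.04, i.e. 4 percent of a single CPU.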
The shared-nothing architecture requires more extensive reorganization of the DBMS code, but it has been shown to provide linear speed-up (the time taken for operations decreases in proportion to the increase in the number of CPUs and disks) and linear scale-up (performance is sustained if the number of CPUs and disks is increased in proportion to the amount of data). Consequently, ever more powerful parallel database systems can be built by taking advantage of rapidly improving performance for single-CPU systems and connecting as many CPUs as desired.




          220                               LOVELY PROFESSIONAL UNIVERSITY