Page 227 - DCAP402_DCAO204_DATABASE MANAGEMENT SYSTEM_MANAGING DATABASE
Architectures of Parallel Database
The basic idea behind parallel databases is to carry out evaluation steps in parallel whenever
possible, in order to improve performance. There are many opportunities for parallelism in a
DBMS; databases represent one of the most successful instances of parallel computing.
Figure 13.1: Physical Architectures for Parallel Database Systems
Three main architectures have been proposed for building parallel DBMSs. In a shared-memory
system, multiple CPUs are attached to an interconnection network and can access a common
region of main memory. In a shared-disk system, each CPU has a private memory and direct
access to all disks through an interconnection network.
In a shared-nothing system, each CPU has local main memory and disk space, but no two CPUs
can access the same storage area; all communication between CPUs is through a network
connection. The three architectures are illustrated in Figure 13.1.
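The shared-nothing arrangement described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the `Node` class, the modulo partitioning rule, and the sample rows are hypothetical names and data chosen for illustration, and the nodes are run one after another in a single process rather than on separate CPUs. The point it shows is that each node reads only its own storage and that only small partial results would travel over the network.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One CPU with its own private memory and disk (modelled as a list)."""
    node_id: int
    local_rows: list = field(default_factory=list)

    def local_sum(self) -> int:
        # Touches only this node's own storage, never another node's.
        return sum(amount for _, amount in self.local_rows)


def partition(rows, nodes):
    """Assign each (key, amount) row to exactly one node (assumed rule: key mod n)."""
    for key, amount in rows:
        nodes[key % len(nodes)].local_rows.append((key, amount))


def parallel_sum(nodes):
    """Combine per-node results; the partial sums stand in for network messages."""
    partial_sums = [node.local_sum() for node in nodes]
    return sum(partial_sums)


# Hypothetical data: six rows spread over three nodes.
nodes = [Node(i) for i in range(3)]
rows = [(k, 10 * k) for k in range(1, 7)]
partition(rows, nodes)
total = parallel_sum(nodes)
print(total)
```

In a shared-memory or shared-disk system, by contrast, every CPU could read all of the rows directly, which is where the contention discussed later in this section comes from.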
The shared-memory architecture is closer to a conventional machine, and many commercial
database systems have been ported to shared-memory platforms with relative ease.
Communication overheads are low, because main memory can be used for this purpose, and
operating system services can be leveraged to utilize the additional CPUs.
Although this approach is attractive for achieving moderate parallelism (a few tens of CPUs can
be exploited in this fashion), memory contention becomes a bottleneck as the number of CPUs
increases. The shared-disk architecture faces a similar problem because large amounts of data
are shipped through the interconnection network.
The basic problem with the shared-memory and shared-disk architectures is interference: as
more CPUs are added, existing CPUs are slowed down because of the increased contention
for memory accesses and network bandwidth. It has been noted that even an average 1 percent
slowdown per additional CPU means that the maximum speedup is a factor of 37, and adding
CPUs beyond that point actually slows down the system; a system with 1,000 CPUs is only 4 percent as
effective as a single-CPU system! This observation has motivated the development of the
shared-nothing architecture, which is now widely considered to be the best architecture for
large parallel database systems.
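The figures quoted above can be checked with a short calculation. The text does not state the underlying model, so the sketch below assumes one: every additional CPU multiplies the speed of each CPU by 0.99, giving a speedup of n × 0.99^(n−1) for n CPUs. The function names are hypothetical. Under this assumption the speedup peaks at roughly 37 near 100 CPUs and falls to about 4 percent of a single CPU at 1,000 CPUs, which agrees with the numbers in the text.

```python
def effective_speedup(n_cpus: int, slowdown: float = 0.01) -> float:
    """Speedup over one CPU, assuming each added CPU slows every CPU by `slowdown`."""
    if n_cpus < 1:
        raise ValueError("n_cpus must be at least 1")
    # Assumed model: the slowdown compounds once per additional CPU.
    per_cpu_rate = (1.0 - slowdown) ** (n_cpus - 1)
    return n_cpus * per_cpu_rate


def best_cpu_count(max_cpus: int = 1000, slowdown: float = 0.01) -> int:
    """CPU count in 1..max_cpus that gives the highest speedup under the model."""
    return max(range(1, max_cpus + 1),
               key=lambda n: effective_speedup(n, slowdown))


for n in (1, 10, 100, 1000):
    print(n, round(effective_speedup(n), 2))
```

Past the peak, each extra CPU costs more in interference than it contributes in work, so the curve turns downward instead of flattening out.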
The shared-nothing architecture requires more extensive reorganization of the DBMS code, but
it has been shown to provide linear speed-up, in that the time taken for operations decreases in
proportion to the increase in the number of CPUs and disks, and linear scale-up, in that
performance is sustained if the number of CPUs and disks is increased in proportion to the
amount of data. Consequently, ever more powerful parallel database systems can be built by
taking advantage of rapidly improving performance for single-CPU systems and connecting as
many CPUs as desired.
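The two measures just defined can be written down as simple ratios. The sketch below is one common way to express them, not a definition taken from the text; the function names, the 5 percent tolerance, and the timings are all hypothetical. Speed-up compares the same workload on a small and a large system, while scale-up compares a small workload on the small system with a proportionally larger workload on the large system.

```python
import math


def speed_up(time_small_system: float, time_large_system: float) -> float:
    """Same workload on both systems; larger values are better."""
    return time_small_system / time_large_system


def scale_up(time_small_system_small_data: float,
             time_large_system_large_data: float) -> float:
    """Workload grown in proportion to the system; 1.0 means performance is sustained."""
    return time_small_system_small_data / time_large_system_large_data


def is_linear_speed_up(resource_factor: float, time_small: float,
                       time_large: float, tolerance: float = 0.05) -> bool:
    """True if elapsed time fell in proportion to the added CPUs and disks."""
    return math.isclose(speed_up(time_small, time_large),
                        resource_factor, rel_tol=tolerance)


def is_linear_scale_up(time_small: float, time_large: float,
                       tolerance: float = 0.05) -> bool:
    """True if elapsed time stayed the same as data and resources grew together."""
    return math.isclose(scale_up(time_small, time_large), 1.0, rel_tol=tolerance)


# Hypothetical timings in seconds: 1 node versus 4 nodes.
print(speed_up(120.0, 30.0))   # same data on 4 times the hardware
print(scale_up(120.0, 120.0))  # 4 times the data on 4 times the hardware
```

A system showing interference of the kind described earlier would fail both checks as the resource factor grows.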
LOVELY PROFESSIONAL UNIVERSITY