Page 236 - DCAP402_DCAO204_DATABASE MANAGEMENT SYSTEM_MANAGING DATABASE

Unit 13: Parallel Databases




          Another important aspect of parallel execution is the re-partitioning of rows as they are sent
          from the servers in one server set to those in another. For the query plan in the figure, after a
          server process in SS1 scans a row of employees, to which server process of SS2 should it send
          that row? The partitioning of rows flowing up the query tree is decided by the operator into
          which the rows flow. In this case, the rows flowing up from SS1, which performs the parallel
          scan of employees, into SS2, which performs the parallel hash-join, are hash partitioned on the
          join column value. That is, a server process scanning employees computes a hash function of
          the value of the column employees.employee_id to decide the number of the server process in
          SS2 to which the row is sent. The partitioning method used in a parallel query is shown
          explicitly in the EXPLAIN PLAN of the query.
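
          The routing decision described above can be sketched in a few lines of Python. This is only
          an illustration under stated assumptions, not Oracle's actual implementation: the function
          names target_server and repartition, the use of CRC32 as the hash function, the sample rows,
          and the choice of four server processes in SS2 are all hypothetical.

```python
import zlib


def target_server(join_key, num_servers):
    """Return the number of the SS2 server process that should receive a row.

    Assumption: CRC32 stands in for whatever hash function the database uses.
    It is chosen here only because it gives the same result on every run.
    """
    digest = zlib.crc32(str(join_key).encode("utf-8"))
    return digest % num_servers


def repartition(rows, key, num_servers):
    """Distribute scanned rows into one bucket per SS2 server process."""
    buckets = [[] for _ in range(num_servers)]
    for row in rows:
        buckets[target_server(row[key], num_servers)].append(row)
    return buckets


# Hypothetical rows produced by the parallel scan of employees in SS1.
employees = [
    {"employee_id": 100, "last_name": "King"},
    {"employee_id": 101, "last_name": "Kochhar"},
    {"employee_id": 102, "last_name": "De Haan"},
    {"employee_id": 103, "last_name": "Hunold"},
    {"employee_id": 100, "last_name": "King"},  # same join key, same target
]

buckets = repartition(employees, "employee_id", 4)
for server_number, bucket in enumerate(buckets):
    print(server_number, [row["employee_id"] for row in bucket])
```

          Because the target depends only on the join column value, rows with equal employee_id values
          always reach the same server process in SS2, so each process can join its share of the rows
          without consulting the others.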




             Notes  The partitioning of rows being sent between sets of execution servers should not be
             confused with Oracle’s partitioning feature whereby tables can be partitioned using hash,
             range, and other methods.

          13.6 Summary


              Parallel database machine architectures have evolved from the use of exotic hardware to
               a software parallel dataflow architecture based on conventional shared-nothing hardware.
              These new designs provide impressive speedup and scale-up when processing relational
               database queries.

          13.7 Keywords

          Horizontal Partitioning: Horizontally partitioning a fact table speeds up queries without
          indexing, by minimizing the set of data to be scanned.
          Inter-query Parallelism: Inter-query parallelism is the ability to use multiple processors to
          execute several independent queries simultaneously.

          Intra-query Parallelism: Intra-query parallelism is the ability to break a single query into
          subtasks and to execute those subtasks in parallel using a different processor for each.
          OLTP: Online Transaction Processing

          Parallel Database: A parallel database system is one that seeks to improve performance through
          parallel implementation of various operations such as loading data, building indexes, and
          evaluating queries.


          13.8 Self Assessment

          Fill in the blanks:
          1.   .............................. main architectures have been proposed for building parallel DBMSs.
          2.   MPP stands for ..................................

          3.   ............................ helps systems scale in performance by making optimal use of hardware
               resources.
          4.   ............................. parallelism does not provide speedup, because each query is still executed
               by only one processor.





                                           LOVELY PROFESSIONAL UNIVERSITY                                   229