Page 214 - DCAP402_DCAO204_DATABASE MANAGEMENT SYSTEM_MANAGING DATABASE
P. 214

Unit 12: Query Processing and Optimization




          12.4.6 Hash-join                                                                      Notes

          This is applicable to both the equi-joins and natural joins. A hash function h is used to partition
          tuples of both relations, where  h maps joining attribute (enroll no in our example) values to
          {0, 1, ..., n-1}.
          The join attribute is hashed to the join-hash partitions. In the example of  Figure 12.4 we have
          used mod 100 function to hashing, and n = 100.
          Error!

                                   Figure  12.4:  A Hash-join  Example




























          Once the partition tables of STUDENT and MARKS are made on the enrolment number, then
          only the corresponding partitions will participate in the join as:

          A STUDENT tuple and a MARKS tuple that satisfy the join condition will have the same value for
          the join attributes. Therefore, they will be hashed to equivalent partition and thus can be joined
          easily.

          Algorithm for Hash-join

          The hash-join of two relations r and s is computed as follows:
          1.   Partition the relation r and s using hashing function h. (When partitioning a relation, one
               block of memory is reserved as the output buffer for each partition).
          2.   For each partition si of s, load the partition into memory and build an in-memory hash
               index on the join attribute.

          3.   Read the tuples in ri from the disk, one block at a time. For each tuple in ri locate each
               matching tuple in si using the in-memory hash index and output the concatenation of their
               attributes.

          In this method, the relation s is called the build relation and r is called the probe relation. The
          value n (the number of partitions) and the hash function h is chosen in such a manner that each
          si should fit in to the  memory. Typically  n  is  chosen as Number of  blocks  of s/Number of




                                           LOVELY PROFESSIONAL UNIVERSITY                                   207
   209   210   211   212   213   214   215   216   217   218   219