Page 218 - DCAP402_DCAO204_DATABASE MANAGEMENT SYSTEM_MANAGING DATABASE
P. 218
Unit 12: Query Processing and Optimization
Pipelining Notes
Evaluate operations in a multi-threaded manner, (i.e., passes tuples output from one operation
to the next parent operation as input) even as the first operation is being executed. In the
previous expression tree, it does not store (materialise) results instead, it passes tuples directly
to the join. Similarly, does not store results of join, and passes tuples directly to the projection.
Thus, there is no need to store a temporary relation on a disk for each operation. Pipelining may
not always be possible or easy for sort, hash-join.
One of the pipelining execution methods may involve a buffer filled by the result tuples of
lower level operation while, records may be picked up from the buffer by the higher level
operation.
Complex Joins
When an expression involves three relations then we have more than one strategy for the
evaluation of the expression. For example, join of relations such as:
STUDENT MARKS SUBJECTS
may involve the following three strategies:
Strategy 1: Compute STUDENT MARKS; use result to compute result SUBJECTS
Strategy 2: Compute MARKS SUBJECTS first, and then join the result with STUDENT
Strategy 3: Perform the pair of joins at the same time. This can be done by building an index of
enroll no in STUDENT and on subject code in SUBJECTS.
For each tuple m in MARKS, look up the corresponding tuples in STUDENT and the corresponding
tuples in SUBJECTS. Each tuple of MARKS will be examined only once.
Strategy 3 combines two operations into one special-purpose operation that may be more
efficient than implementing the joins of two relations.
12.6 Creation of Query Evaluation Plans
Generation of query-evaluation plans for an expression involves several steps:
1. Generating logically equivalent expressions using equivalence rules
2. Annotating resultant expressions to get alternative query plans
3. Choosing the cheapest plan based on estimated cost.
The overall process is called cost based optimisation.
The cost difference between a good and a bad method of evaluating a query would be enormous.
We would therefore, need to estimate the cost of operations and statistical information about
relations. For example a number of tuples, a number of distinct values for an attribute etc.
Statistics helps in estimating intermediate results to compute the cost of complex expressions.
12.7 Transformation of Relational Expressions
Two relational algebraic expressions are said to be equivalent if on every legal database instance
the two expressions generate the same set of tuples (order of tuples is irrelevant).
LOVELY PROFESSIONAL UNIVERSITY 211