Page 215 - DCAP603_DATAWARE_HOUSING_AND_DATAMINING
P. 215
Unit 11: Query Processing and Optimization
You can define query evaluation as the query-execution engine taking a query-evaluation plan, notes
executing that plan, and returning the answers to the query. The query processing involves the
study of the following concepts:
How to measure query costs?
Algorithms for evaluating relational algebraic operations.
How to evaluate a complete expression using algorithms on individual operations?
optimisation
A relational algebra expression may have many equivalent expressions.
Example: σ (salary < 5000) (πsalary (EMP)) is equivalent to πsalary (σsalary <5000
(EMP)).
Each relational algebraic operation can be evaluated using one of the several different algorithms.
Correspondingly, a relational-algebraic expression can be evaluated in many ways.
An expression that specifies detailed evaluation strategy is known as evaluation-plan, for
example, you can use an index on salary to find employees with salary < 5000 or we can perform
complete relation scan and discard employees with salary ≥ 5000. The basis of selection of any of
the scheme will be the cost.
Query Optimisation: Amongst all equivalent plans choose the one with the lowest cost. Cost
is estimated using statistical information from the database catalogue, for example, number of
tuples in each relation, size of tuples, etc.
Thus, in query optimisation we find an evaluation plan with the lowest cost. The cost estimation
is made on the basis of heuristic rules.
11.2 Description and requirements for Data Warehouse Queries
Most data warehouse queries are designed to follow a star schema and can process hundreds
of millions of rows in a single query. By default, the query optimizer detects queries against
star schemas and creates efficient query plans for them. One method the optimizer can use to
generate an efficient plan is to use bitmap filtering. A bitmap filter uses a compact representation
of a set of values from a table in one part of the operator tree to filter rows from a second table
in another part of the tree. Essentially, the filter performs a semi-join reduction; that is, only the
rows in the second table that qualify for the join to the first table are processed.
In SQL Server 2008, bitmap filtering can be introduced in the query plan after optimization,
as in SQL Server 2005, or introduced dynamically by the query optimizer during query plan
generation. When the filter is introduced dynamically, it is referred to as an optimized bitmap
filter. Optimized bitmap filtering can significantly improve the performance of data warehouse
queries that use star schemas by removing non-qualifying rows from the fact table early in the
query plan. Without optimized bitmap filtering, all rows in the fact table are processed through
some part of the operator tree before the join operation with the dimension tables removes the
non-qualifying rows. When optimized bitmap filtering is applied, the non-qualifying rows in the
fact table are eliminated immediately.
11.2.1 understanding Bitmap filtering
The bitmap filter compares favorably to the bitmap index. A bitmap index is an alternate form
for representing row ID (RID) lists in a value-list index using one or more bit vectors indicating
which row in a table contains a certain column value. Both can be very effective in removing
unnecessary rows from result processing; however, there are important differences between a
LoveLy professionaL university 209