Page 176 - DCAP402_DCAO204_DATABASE MANAGEMENT SYSTEM_MANAGING DATABASE
P. 176
Unit 10: Datalog and Recursion
RECURSIVE keyword signals that the table (in our example, Components) is recursively defined. Notes
The structure of the definition closely parallels the Datalog rules. Incidentally, if we wanted to
find the components of a particular part, for example, trike, we can simply replace the last line
with the following:
SELECT * FROM Components C2
WHERE C2.Part = ‘trike’
10.2 Evaluation of Datalog Program
We classify the relations in a Datalog program as either output relations or input relations.
Output relations are defined by rules (e.g., Components), and input relations have a set of tuples
explicitly listed (e.g., Assembly). Given instances of the input relations, we must compute instances
for the output relations. The meaning of a Datalog program is usually defined in two different
ways, both of which essentially describe the relation instances for the output relations. Technically,
a query is a selection over one of the output relations (e.g., all Components tuples C with
C.part = trike). However, the meaning of a query is clear once we understand how relation
instances are associated with the output relations in a Datalog program.
The first approach to defining what a Datalog program means is called the least model semantics
and gives users a way to understand the program without thinking about how the program is to
be executed. That is, the semantics is declarative, like the semantics of relational calculus, and
not operational like relational algebra semantics. This is important because the presence of
recursive rules makes it difficult to understand a program in terms of an evaluation strategy.
The second approach, called the least fixpoint semantics, gives a conceptual evaluation strategy
to compute the desired relation instances. This serves as the basis for recursive query evaluation
in a DBMS. More efficient evaluation strategies are used in an actual implementation, but their
correctness is shown by demonstrating their equivalence to the least fixpoint approach. The
fixpoint semantics is thus operational and plays a role analogous to that of relational algebra
semantics for non-recursive queries.
Least Model Semantics
We want users to be able to understand a Datalog program by understanding each rule
independently of other rules, with the meaning: If the body is true, the head is also true. This
intuitive reading of a rule suggests that given certain relation instances for the relation names
that appear in the body of a rule, the relation instance for the relation mentioned in the head of
the rule must contain a certain set of tuples. If a relation name R appears in the heads of several
rules, the relation instance for R must satisfy the intuitive reading of all these rules. However,
we do not want tuples to be included in the instance for R unless they are necessary to satisfy one
of the rules defining R. That is, we only want to compute tuples for R that are supported by some
rule for R.
To make these ideas precise, we need to introduce the concepts of models and least models. A
model is a collection of relation instances, one instance for each relation in the program, that
satisfies the following condition. For every rule in the program, whenever we replace each
variable in the rule by a corresponding constant, the following holds:
1. If every tuple in the body (obtained by our replacement of variables with constants) is in
the corresponding relation instance,
2. Then the tuple generated for the head (by the assignment of constants to variables that
appear in the head) is also in the corresponding relation instance.
LOVELY PROFESSIONAL UNIVERSITY 169