Unit 7: Probabilistic Reasoning
Introduction
In statistics, Bayesian inference is a method of inference in which Bayes' rule is used to update the
probability estimate for a hypothesis as additional evidence is learned. Bayesian updating is an
important technique throughout statistics, and especially in mathematical statistics. For some
cases, exhibiting a Bayesian derivation for a statistical method automatically ensures that the
method works as well as any competing method. Bayesian updating is especially important in
the dynamic analysis of a sequence of data. Bayesian inference has found application in a range
of fields including science, engineering, philosophy, medicine, and law.
In the philosophy of decision theory, Bayesian inference is closely related to discussions of
subjective probability, often called “Bayesian probability”. Bayesian probability provides a
rational method for updating beliefs; however, non-Bayesian updating rules are compatible
with rationality, according to philosophers Ian Hacking and Bas van Fraassen.
7.1 Bayesian Probabilistic Inference
Bayesian inference derives the posterior probability as a consequence of two antecedents, a
prior probability and a “likelihood function” derived from a probability model for the data to
be observed. Bayesian inference computes the posterior probability according to Bayes’ rule:
P(H|E) = P(E|H) · P(H) / P(E)
where
| means given.
H stands for any hypothesis whose probability may be affected by data (called evidence
below). Often there are competing hypotheses, from which one chooses the most probable.
The evidence E corresponds to data that were not used in computing the prior probability.
P(H), the prior probability, is the probability of H before E is observed. This indicates
one’s preconceived beliefs about how likely different hypotheses are, absent evidence
regarding the instance under study.
P(H|E), the posterior probability, is the probability of H given E, i.e., after E is observed.
This tells us what we want to know: the probability of a hypothesis given the observed
evidence.
P(E|H), the probability of observing E given H, is also known as the likelihood. It indicates
the compatibility of the evidence with the given hypothesis.
P(E) is sometimes termed the marginal likelihood or “model evidence”. This factor is the
same for all possible hypotheses being considered. (This can be seen by the fact that the
hypothesis H does not appear anywhere in the symbol, unlike for all the other factors.)
This means that this factor does not enter into determining the relative probabilities of
different hypotheses.
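As a concrete illustration of how these quantities combine, the following short Python sketch applies Bayes' rule to a hypothetical disease-test scenario. The prior, sensitivity and false-positive rate are assumed values chosen only for illustration; they are not figures from this unit.

    # Bayes' rule: P(H|E) = P(E|H) * P(H) / P(E)
    def posterior(prior_h, likelihood_e_given_h, marginal_e):
        """Return P(H|E) given P(H), P(E|H) and P(E)."""
        return likelihood_e_given_h * prior_h / marginal_e

    # Hypothesis H: the patient has a disease; evidence E: a positive test.
    p_h = 0.01                    # prior P(H), assumed prevalence
    p_e_given_h = 0.95            # likelihood P(E|H), assumed test sensitivity
    p_e_given_not_h = 0.05        # P(E|not H), assumed false-positive rate

    # Marginal likelihood P(E), obtained by summing over both hypotheses.
    p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)

    print(posterior(p_h, p_e_given_h, p_e))   # approximately 0.161

Even with a highly sensitive test, the low prior keeps the posterior modest; this interplay between prior belief and observed evidence is exactly what Bayes' rule formalises.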
Note that what affects the value of P(H|E) for different hypotheses H are only the factors P(H) and
P(E|H), which both appear in the numerator; hence the posterior probability is proportional
to both. In words: the posterior probability of a hypothesis is determined by a combination
of the inherent likeliness of the hypothesis (the prior) and the compatibility of the observed
evidence with the hypothesis (the likelihood).
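Because P(E) is common to all hypotheses, one can compute an unnormalised score prior × likelihood for each competing hypothesis and then normalise so that the posteriors sum to one. The sketch below does this for two hypothetical hypotheses H1 and H2; all numbers are assumed for illustration.

    # Posterior is proportional to prior * likelihood; P(E) is only the normaliser.
    priors = {"H1": 0.7, "H2": 0.3}        # assumed P(H) for each hypothesis
    likelihoods = {"H1": 0.2, "H2": 0.6}   # assumed P(E|H) for each hypothesis

    unnormalised = {h: priors[h] * likelihoods[h] for h in priors}
    p_e = sum(unnormalised.values())       # marginal likelihood P(E)
    posteriors = {h: score / p_e for h, score in unnormalised.items()}

    print(posteriors)   # {'H1': 0.4375, 'H2': 0.5625}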