Page 200 - DCAP310_INTRODUCTION_TO_ARTIFICIAL_INTELLIGENCE_AND_EXPERT_SYSTEMS
Introduction to Artificial Intelligence & Expert Systems
Update all relevant structures in the basis algorithm.
Perform the remaining steps required by the basis algorithm.
Note the definition of the sequence y:
y = [x_{i-n} ... x_i].
If x_{i-n} or other elements of the sequence are not available, there are two options for handling
this situation:
Output x_i uncoded and skip to the next symbol.
Add a shortened symbol to the alphabet.
The first option is obvious and needs no further clarification. The second option can be
enhanced and deserves a more detailed explanation. I will not address it directly here because it
can be implemented as a variant of Multilevel PPM.
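The first option can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names context_window and encode_step are hypothetical, symbols are single characters, and the "coded" branch simply returns the sequence y instead of calling a real PPM coder, which this section does not specify.

```python
from typing import List, Optional, Sequence, Tuple


def context_window(symbols: Sequence[str], i: int, n: int) -> Optional[List[str]]:
    """Return y = [x_{i-n} ... x_i], or None if x_{i-n} is not available."""
    if i - n < 0:
        # The sequence would start before the beginning of the input.
        return None
    return list(symbols[i - n : i + 1])


def encode_step(symbols: Sequence[str], i: int, n: int) -> Tuple[str, object]:
    """Handle position i (assumed to be a valid index into symbols)."""
    y = context_window(symbols, i, n)
    if y is None:
        # Option 1 from the text: output x_i uncoded and move on.
        return ("uncoded", symbols[i])
    # Placeholder for the basis algorithm: a real coder would encode y here.
    return ("coded", tuple(y))


if __name__ == "__main__":
    data = "abcde"
    for i in range(len(data)):
        print(i, encode_step(data, i, n=2))
```

With n = 2, the first two positions have no complete sequence and are emitted uncoded; from the third position onward the full window is available.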
Self Assessment
State whether the following statements are true or false:
10. Prediction by Partial Matching (PPM) is an adaptive statistical data compression technique based on
context modeling and prediction.
11. Predictions are increased to symbol rankings.
12. The symbol size is usually static.
10.5 Fuzzy Matching Algorithms
Fuzzy matching is a mathematical process used to determine how similar one piece of data is to
another.
Example: If you had ‘John Arun Smith’ in one system and wanted to find similar names
from another system, the fuzzy matching process may return a list like this:
J A Smith
John A Smith
JA Smith
Jon Smith
Jonathan A Smith
Jon A Smithe
For each entry examined, the fuzzy matching process can give a similarity score indicating how
closely it matches the original. For example, ‘John A Smith’ might receive a 95% similarity score,
whereas ‘JA Smith’ would only get a 72% score.
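As a rough illustration of such scoring, the sketch below ranks the candidate names using difflib.SequenceMatcher from the Python standard library. This is one possible similarity measure, not the method behind the percentages quoted above, so the scores it produces will differ from them. The helper names similarity and rank_candidates are assumptions made for this example.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def similarity(a: str, b: str) -> float:
    """Return a case-insensitive similarity score between 0 and 100."""
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def rank_candidates(query: str, candidates: List[str]) -> List[Tuple[str, float]]:
    """Score every candidate against the query, best match first."""
    scored = [(name, similarity(query, name)) for name in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    query = "John Arun Smith"
    candidates = [
        "J A Smith",
        "John A Smith",
        "JA Smith",
        "Jon Smith",
        "Jonathan A Smith",
        "Jon A Smithe",
    ]
    for name, score in rank_candidates(query, candidates):
        print(f"{name:18s} {score:5.1f}%")
```

A production system would typically combine several measures (edit distance, phonetic codes, token reordering) rather than rely on a single ratio.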
Fuzzy matching attempts to improve recall by being less strict, without sacrificing relevance.
With fuzzy matching, the algorithm is designed to find documents containing terms related to
the terms used in the query. The assumption is that related words (in the English language) are
likely to share the same core and differ at the beginning and/or end. A search for “matching”, for
example, would also return documents containing “match”, “matched”, etc. Unfortunately, it will
also return documents containing unrelated words such as “matchbox”.
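The shared-core assumption can be made concrete with a deliberately naive sketch. The suffix list, the minimum core length of three characters, and the function names core and fuzzy_term_match are all assumptions chosen for illustration; real search engines use proper stemmers or edit-distance matching.

```python
from typing import Iterable, List, Tuple


def core(term: str, suffixes: Tuple[str, ...] = ("ing", "ed", "es", "s")) -> str:
    """Strip one common English suffix to approximate the word's core."""
    for suffix in suffixes:
        # Keep at least three characters so short words are not destroyed.
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            return term[: -len(suffix)]
    return term


def fuzzy_term_match(query: str, vocabulary: Iterable[str]) -> List[str]:
    """Return every word that contains the core of the query term."""
    stem = core(query.lower())
    # Containment allows differences at the beginning and/or the end.
    return [word for word in vocabulary if stem in word.lower()]


if __name__ == "__main__":
    vocabulary = ["match", "matched", "matching", "matchbox", "mismatch", "patch"]
    print(fuzzy_term_match("matching", vocabulary))
    # prints ['match', 'matched', 'matching', 'matchbox', 'mismatch']
```

The result shows both sides of the trade-off: “match” and “matched” are found, but so is the unrelated “matchbox”, because it contains the same core.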
194 LOVELY PROFESSIONAL UNIVERSITY