Page 163 - DCAP208_Management Support Systems
Self Assessment
Fill in the blanks:
1. Data mining ........................ collect data and model it to represent reality.
2. ........................ is used to monitor data changes and the information contained in the
database, and to update the on-screen display.
3. ........................ tool has the ability to mine data in various kinds of text, such as Microsoft
Word and Acrobat PDF documents.
4. ........................ is the process of implementing the extracted patterns to determine differences
or non-standardized data.
10.2 Data Mining Techniques
These techniques have been divided into classical techniques and next generation techniques.
This division is based on when the data mining technique was developed and when it became
technically mature enough to be used for business, especially for aiding in the optimization of
customer relationship management systems.
Classical Techniques: Statistics, Neighborhoods and Clustering are techniques that have
been in use for decades; the next group covers techniques that have only been widely
used since the early 1980s. The main techniques that we will discuss here
are the ones that are used 99.9% of the time on existing business problems. There are
certainly many others, as well as proprietary techniques from particular vendors, but
in general the industry is converging on those techniques that work consistently and are
understandable and explainable.
Next Generation Techniques: Trees, Networks and Rules are the most often used data
mining techniques among those developed over the last two
decades of research. They also represent the vast majority of the techniques that are being
spoken about when data mining is mentioned in the popular press. These techniques can
be used either for discovering new information within large databases or for building
predictive models. Though older decision tree techniques such as CHAID are currently
widely used, newer techniques such as CART are gaining wider acceptance.
10.2.1 Statistics
By strict definition “statistics” or statistical techniques are not data mining. They were being
used long before the term data mining was coined to apply to business applications. However,
statistical techniques are driven by the data and are used to discover patterns and build predictive
models. From the user's perspective, you will face a conscious choice when solving
a “data mining” problem: whether to attack it with statistical methods or with other
data mining techniques.
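As one illustration of a statistical technique that is driven by the data and yields a predictive model, the following minimal Python sketch fits a straight line by ordinary least squares. The data values and the names fit_line, predict, ad_spend and units_sold are hypothetical and are not taken from the text; this is a sketch under those assumptions, not a prescribed method.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data: advertising spend versus units sold.
ad_spend = [1, 2, 3, 4, 5]
units_sold = [3, 5, 7, 9, 11]

slope, intercept = fit_line(ad_spend, units_sold)


def predict(x):
    """Use the fitted line as a simple predictive model."""
    return slope * x + intercept


print(slope, intercept, predict(6))
```

The pattern discovered here is the linear relationship between the two columns; predict then applies that pattern to a value that was not in the data.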
Statistics is a branch of mathematics concerning the collection and the description of data.
Statistics is usually considered to be one of those scary topics in college, right up there with
chemistry and physics. However, statistics is probably a much friendlier branch of mathematics
because it really can be used every day. Statistics was in fact born from the very humble beginnings
of real-world problems in business, biology, and gambling!
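The "description of data" mentioned above can be sketched in a few lines of Python using the standard statistics module. The order values below are hypothetical and chosen only for illustration.

```python
from statistics import mean, median, stdev

# Hypothetical data: the value of five customer orders.
order_values = [20, 35, 35, 40, 70]

# A small descriptive summary of the collected data.
summary = {
    "count": len(order_values),
    "min": min(order_values),
    "max": max(order_values),
    "mean": mean(order_values),      # arithmetic average
    "median": median(order_values),  # middle value when sorted
    "stdev": stdev(order_values),    # sample standard deviation
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Here the mean (40) sits above the median (35) because the single large order pulls the average upward, which is the kind of everyday observation a summary like this makes visible.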
Knowing statistics will help the average business person make better
decisions by allowing them to assess risk and uncertainty when all the facts either aren’t
known or can’t be collected. Even with all the data stored in the largest of data warehouses
156 LOVELY PROFESSIONAL UNIVERSITY