Page 171 - DCAP208_Management Support Systems
sufficiently distinguish between those who are churners and those who are not - let’s say it is
40%/60%. On the other hand, there may be a series of questions that do quite a nice job of
distinguishing those cellular phone customers who will churn from those who won’t. Maybe the
series of questions would be something like: “Have you been a customer for less than a year, do
you have a telephone that is more than two years old, and were you originally landed as a
customer via tele-sales rather than direct sales?” This series of questions defines a segment of the
customer population in which 90% churn. These are then relevant questions to ask when
predicting churn.
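To make the idea of a segment concrete, the short Python sketch below applies the three questions to a handful of customer records and compares the churn rate inside the segment with the overall rate. The records, the field names (tenure_months, phone_age_years, channel, churned) and the resulting percentages are all invented for illustration; they are not data from the text.

```python
# Toy customer records. Every value here is made up for illustration only.
customers = [
    {"tenure_months": 6,  "phone_age_years": 3,   "channel": "tele-sales", "churned": True},
    {"tenure_months": 8,  "phone_age_years": 4,   "channel": "tele-sales", "churned": True},
    {"tenure_months": 3,  "phone_age_years": 2.5, "channel": "tele-sales", "churned": True},
    {"tenure_months": 10, "phone_age_years": 3,   "channel": "tele-sales", "churned": False},
    {"tenure_months": 24, "phone_age_years": 1,   "channel": "direct",     "churned": False},
    {"tenure_months": 36, "phone_age_years": 3,   "channel": "direct",     "churned": False},
    {"tenure_months": 5,  "phone_age_years": 1,   "channel": "tele-sales", "churned": False},
    {"tenure_months": 18, "phone_age_years": 4,   "channel": "tele-sales", "churned": True},
    {"tenure_months": 48, "phone_age_years": 2,   "channel": "direct",     "churned": False},
    {"tenure_months": 9,  "phone_age_years": 3,   "channel": "direct",     "churned": False},
]


def in_segment(customer):
    """True when the customer answers 'yes' to all three questions."""
    return (
        customer["tenure_months"] < 12            # customer for less than a year
        and customer["phone_age_years"] > 2       # phone more than two years old
        and customer["channel"] == "tele-sales"   # landed via tele-sales
    )


def churn_rate(records):
    """Fraction of records that churned, or None for an empty list."""
    if not records:
        return None
    return sum(1 for r in records if r["churned"]) / len(records)


segment = [c for c in customers if in_segment(c)]

print("overall churn rate:", churn_rate(customers))
print("segment churn rate:", churn_rate(segment))
```

A series of questions is useful for prediction when the churn rate inside the segment differs sharply from the overall rate, as it does in this toy data.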
If the decision tree algorithm simply continued growing the tree like this, it could conceivably
create more and more questions and branches in the tree until eventually there was only one
record in each segment. Letting the tree grow to this size is both computationally expensive and
unnecessary. Most decision tree algorithms stop growing the tree when one of three criteria
is met:
• The segment contains only one record. (There is no further question that you could ask
which could further refine a segment of just one.)
• All the records in the segment have identical characteristics. (There is no reason to continue
asking further questions since all the remaining records are the same.)
• The improvement is not substantial enough to warrant making the split.
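The three stopping criteria can be sketched as a single check that a tree-growing routine might call before splitting a segment. This is only an illustrative sketch: the function name should_stop, the min_improvement threshold and its default value are assumptions, and the way "improvement" is measured (for example, as a reduction in impurity) is left to the caller.

```python
def should_stop(records, best_improvement, min_improvement=0.01):
    """Return the reason to stop splitting this segment, or None to keep going.

    records          -- list of dicts holding the predictor values of each
                        record in the segment (target column excluded)
    best_improvement -- improvement offered by the best candidate split,
                        measured however the algorithm chooses (assumption)
    min_improvement  -- hypothetical threshold below which a split is not
                        considered worthwhile
    """
    # Criterion 1: a segment of one record (or none) cannot be refined further.
    if len(records) <= 1:
        return "single record"

    # Criterion 2: all records are identical, so no question can separate them.
    first = records[0]
    if all(r == first for r in records):
        return "identical records"

    # Criterion 3: the best available split does not improve things enough.
    if best_improvement < min_improvement:
        return "improvement too small"

    return None


print(should_stop([{"tenure": 6}], 0.5))
print(should_stop([{"tenure": 6}, {"tenure": 6}], 0.5))
print(should_stop([{"tenure": 6}, {"tenure": 30}], 0.001))
print(should_stop([{"tenure": 6}, {"tenure": 30}], 0.5))
```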
10.2.5 Neural Networks
When data mining algorithms are discussed these days, most of the time people are talking
about either decision trees or neural networks. Of the two, neural networks have probably been
of greater interest through the formative stages of data mining technology. As we will see,
neural networks do have disadvantages that can limit their ease of use and ease of
deployment, but they also have some significant advantages. Foremost among these
advantages is their ability to produce highly accurate predictive models that can be applied
across a large number of different types of problems.
To be more precise with the term “neural network”, one might better speak of an “artificial
neural network”. True neural networks are biological systems (a.k.a. brains) that detect patterns,
make predictions and learn. The artificial ones are computer programs implementing
sophisticated pattern detection and machine learning algorithms to build
predictive models from large historical databases. Artificial neural networks derive their name
from their historical development, which started off with the premise that machines could be
made to “think” if scientists found ways to mimic the structure and functioning of the human
brain on the computer. Thus, historically, neural networks grew out of the Artificial
Intelligence community rather than from the discipline of statistics. Despite the fact that
scientists are still far from understanding the human brain, let alone mimicking it, neural
networks that run on computers can do some of the things that people can do.
It is difficult to say exactly when the first “neural network” on a computer was built. During
World War II, a seminal paper was published by McCulloch and Pitts which first outlined the
idea that simple processing units (like the individual neurons in the human brain) could be
connected together in large networks to create a system that could solve difficult problems and
display behavior much more complex than that of the simple pieces that made it up. Since that
time, much progress has been made in finding ways to apply artificial neural networks to
real-world prediction problems and in improving the performance of the algorithm in general. In
many respects, the greatest breakthroughs in neural networks in recent years have been in their
application to more mundane real-world problems like customer response prediction or fraud
detection, rather than the loftier goals that were originally set out for the techniques, such as
overall human learning and computer speech and image understanding.
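The idea of simple processing units combining into something more capable can be illustrated with a small Python sketch. It uses a simplified weighted threshold unit in the spirit of the McCulloch and Pitts idea, not the exact formulation of their paper; the function names, weights and thresholds are chosen here purely for illustration. Each unit only compares a weighted sum with a threshold, yet three kinds of unit wired together compute exclusive-or, a function that no single unit of this kind can compute.

```python
def unit(inputs, weights, threshold):
    """Simplified threshold unit: output 1 if the weighted sum of the
    inputs reaches the threshold, otherwise 0."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0


# Three simple units, each defined only by its weights and threshold
# (values chosen for this illustration).
def or_unit(a, b):
    return unit([a, b], [1, 1], 1)


def and_unit(a, b):
    return unit([a, b], [1, 1], 2)


def not_unit(a):
    return unit([a], [-1], 0)


def xor_network(a, b):
    """A small network of units: (a OR b) AND NOT (a AND b)."""
    return and_unit(or_unit(a, b), not_unit(and_unit(a, b)))


for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor_network(a, b))
```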
164 LOVELY PROFESSIONAL UNIVERSITY