Page 148 - DCAP208_Management Support Systems
Unit 9: Data Mining
looking for a sunken Spanish galleon on the high seas, the first thing you might do is to research
the times when Spanish treasure had been found by others in the past.
Notes: You might note that these ships tend to be found off the coast of Bermuda, that
the ocean currents there share certain characteristics, and that the ships’ captains of that
era likely took certain routes. You note these similarities and build a
model that includes the characteristics common to the locations of these sunken
treasures.
With this model in hand, you sail off looking for treasure where the model indicates it is most
likely to be, given similar situations in the past. Hopefully, if you’ve got a good model, you
find your treasure.
This act of model building is thus something that people have been doing for a long time,
certainly since before the advent of computers or data mining technology. What happens on computers,
however, is not much different from the way people build models. Computers are loaded
with lots of information about a variety of situations where the answer is known, and then the
data mining software runs through that data and distills the characteristics
of the data that should go into the model. Once the model is built, it can be used in similar
situations where you don’t know the answer.
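The build-then-apply cycle described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the method of any particular data mining product: the toy data, the single numeric feature, and the names `build_model` and `predict` are all assumptions made for the example. The "model" here is simply the threshold that best separates the cases whose answer is already known.

```python
# Hypothetical toy data (not from the text): (feature value, known answer).
known_cases = [(10, False), (20, False), (35, True),
               (50, True), (15, False), (40, True)]


def build_model(cases):
    """Distill one characteristic: the threshold that best fits the known answers."""
    best_threshold, best_correct = None, -1
    for threshold in sorted({value for value, _ in cases}):
        # Count the known cases that the rule "value >= threshold" gets right.
        correct = sum((value >= threshold) == answer for value, answer in cases)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


def predict(threshold, value):
    """Apply the model to a new situation where the answer is not known."""
    return value >= threshold


model = build_model(known_cases)
print(model)               # threshold learned from the known cases
print(predict(model, 45))  # prediction for a new, unlabeled case
```

Real data mining software considers many characteristics at once, but the two phases are the same: learn from cases with known answers, then apply the result to cases without them.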
Example: Say that you are the director of marketing for a telecommunications company
and you’d like to acquire some new long-distance phone customers. You could just randomly go
out and mail coupons to the general population, just as you could randomly sail the seas
looking for sunken treasure. In neither case would you achieve the results you desired, and of
course you have the opportunity to do much better than random: you could use the business
experience stored in your database to build a model.
As the marketing director, you have access to a lot of information about all of your customers: their
age, sex, credit history, and long-distance calling usage. The good news is that you also have a lot
of information about your prospective customers: their age, sex, credit history, etc. Your problem
is that you don’t know the long-distance calling usage of these prospects (since they are most likely
now customers of your competition). You’d like to concentrate on those prospects who have large
amounts of long-distance usage. You can accomplish this by building a model. Table 9.1 illustrates
the data used for building a model for new customer prospecting in a data warehouse.
Table 9.1: Data Mining for Prospecting

                                                       Customers   Prospects
General information (e.g. demographic data)            Known       Known
Proprietary information (e.g. customer transactions)   Known       Target
The goal in prospecting is to make some calculated guesses about the information in the lower
right-hand quadrant (the Target cell), based on the model that we build going from Customer General Information
to Customer Proprietary Information. For instance, a simple model for a telecommunications
company might be:
98% of my customers who make more than $60,000/year spend more than $80/month on long
distance.
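A percentage like the one in this rule is the share of customers meeting the income condition who also meet the spending condition. The sketch below shows how such a figure could be computed from customer records. The records, the field names `income` and `monthly_long_distance`, and the function `rule_confidence` are hypothetical, and the toy data is far too small to reproduce the 98% quoted above.

```python
# Hypothetical customer records; both kinds of information are known for customers.
customers = [
    {"income": 75000, "monthly_long_distance": 95.0},
    {"income": 64000, "monthly_long_distance": 110.0},
    {"income": 90000, "monthly_long_distance": 60.0},
    {"income": 61000, "monthly_long_distance": 85.0},
    {"income": 38000, "monthly_long_distance": 20.0},
]


def rule_confidence(records, income_min, spend_min):
    """Fraction of records above income_min that also spend more than spend_min."""
    matching = [r for r in records if r["income"] > income_min]
    if not matching:
        return None  # the rule cannot be evaluated without matching customers
    hits = sum(r["monthly_long_distance"] > spend_min for r in matching)
    return hits / len(matching)


confidence = rule_confidence(customers, income_min=60000, spend_min=80)
print(f"{confidence:.0%} of customers earning over $60,000 spend over $80/month")
```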
This model could then be applied to the prospect data to infer something about the proprietary
information that this telecommunications company does not currently have access to. With this
model in hand, new customers can be selectively targeted.
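Applying the rule to prospects uses only the general information that is known for them. The following sketch assumes the $60,000 income threshold from the rule above. The prospect records and the function `select_targets` are hypothetical.

```python
# Hypothetical prospect records: general information only, usage is unknown.
prospects = [
    {"name": "P1", "income": 82000},
    {"name": "P2", "income": 41000},
    {"name": "P3", "income": 67000},
]

INCOME_THRESHOLD = 60000  # income condition taken from the rule in the text


def select_targets(records, income_threshold=INCOME_THRESHOLD):
    """Return the prospects the rule predicts to be heavy long-distance users."""
    return [r["name"] for r in records if r["income"] > income_threshold]


targets = select_targets(prospects)
print(targets)  # prospects to receive the mailing
```

The prediction is only a calculated guess: the rule was learned from existing customers, so it holds for prospects only to the extent that prospects resemble them.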