Page 174 - DCAP208_Management Support Systems


Unit 10: Data Mining Tools and Techniques

examination it turns out that the distributor was delivering product to, but not collecting payment
from, one of their customers.
A sale on men’s suits is being held in all branches of a department store in southern California.
All stores with these characteristics except one have seen at least a 100% jump in revenue since
the start of the sale. It turns out that this store had, unlike the others, advertised via radio
rather than television.

Neural Networks for Feature Extraction

One of the important problems in data mining is determining which predictors are the most
relevant and the most important for building models that predict accurately.
These predictors may be used by themselves, or they may be used in conjunction with other
predictors to form “features”.

Example: A simple feature in the kinds of problems that neural networks work on is a vertical
line in a computer image.
The predictors, or raw input data, are just the colored pixels that make up the picture. Recognizing
that the predictors (pixels) can be organized in such a way as to create lines, and then using the
line as the input predictor, can dramatically improve the accuracy of the model and
decrease the time needed to create it.
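To make the idea concrete, the following is a minimal sketch of turning raw pixel predictors into "vertical line" features. The tiny black-and-white image, the function names, and the `min_length` threshold are all illustrative assumptions, not part of the original example.

```python
def longest_vertical_run(image, col):
    """Length of the longest unbroken run of 'on' pixels in one column."""
    best = current = 0
    for row in image:
        current = current + 1 if row[col] else 0
        best = max(best, current)
    return best


def vertical_line_features(image, min_length=3):
    """One binary feature per column: 1 if the column holds a vertical line.

    min_length is a hypothetical threshold for how many stacked pixels
    count as a 'line'.
    """
    width = len(image[0])
    return [1 if longest_vertical_run(image, c) >= min_length else 0
            for c in range(width)]


# A made-up 4 x 5 image: 1 = dark pixel, 0 = light pixel.
image = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0],
    [1, 0, 1, 0, 0],
]

features = vertical_line_features(image)
print(features)  # [0, 0, 1, 0, 0]: only column 2 contains a vertical line
```

Here the 20 raw pixel values are summarized by 5 line features, and the stray dark pixels in columns 0 and 4 no longer reach the model as separate predictors.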
Some features, like lines in computer images, are things that humans are already pretty good at
detecting. In other problem domains it is more difficult to recognize the features.
One novel way that neural networks have been used to detect features rests on the idea that
features are a kind of compression of the training database. For instance, you could describe an
image to a friend by rattling off the color and intensity of every pixel in the picture, or you
could describe it at a higher level in terms of lines and circles, or at a still higher level in
terms of features such as trees and mountains. In either case your friend eventually gets all the
information that they need in order to know what the picture looks like, but describing it in terms
of high-level features requires much less communication of information than the “paint by
numbers” approach of describing the color of each square millimeter of the image.

If we think of features in this way, as an efficient way to communicate our data, then neural
networks can be used to extract them automatically. The neural network shown in Figure 10.3
extracts features by being required to learn to recreate the input data at the output nodes
using just 5 hidden nodes. If you were allowed 100 hidden nodes, enough to give every input node
a hidden node of its own, recreating the data would be rather trivial: simply pass each input
node value directly through the corresponding hidden node and on to the output node. But as the
number of hidden nodes shrinks, the information has to be passed through the hidden layer in a
more and more efficient manner, since there are fewer hidden nodes to help pass it along.
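The bottleneck idea described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the network has 10 input nodes and purely linear nodes (both choices are ours; only the 5 hidden nodes come from the text), the training data is synthetic, and the weights are fitted by simple stochastic gradient descent rather than by any particular method from the source.

```python
import random

N_INPUT, N_HIDDEN = 10, 5  # 5 hidden nodes as in the text; 10 inputs is an assumption


def matvec(W, v):
    """Multiply weight matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def make_data(n_samples, rng):
    """Synthetic records: 10 values driven by only 3 underlying quantities."""
    data = []
    for _ in range(n_samples):
        a, b, c = rng.random(), rng.random(), rng.random()
        data.append([a] * 4 + [b] * 3 + [c] * 3)
    return data


def reconstruction_error(data, W_enc, W_dec):
    """Mean squared distance between each input and its recreated output."""
    total = 0.0
    for x in data:
        x_hat = matvec(W_dec, matvec(W_enc, x))
        total += sum((p - q) ** 2 for p, q in zip(x_hat, x))
    return total / len(data)


def train(data, W_enc, W_dec, epochs=200, lr=0.02):
    """Fit input -> hidden -> output weights so that output recreates input."""
    for _ in range(epochs):
        for x in data:
            h = matvec(W_enc, x)          # hidden node values (the "features")
            x_hat = matvec(W_dec, h)      # recreated input at the output nodes
            err = [p - q for p, q in zip(x_hat, x)]
            # Error signal at each hidden node, computed before W_dec changes.
            grad_h = [sum(err[i] * W_dec[i][j] for i in range(N_INPUT))
                      for j in range(N_HIDDEN)]
            for i in range(N_INPUT):
                for j in range(N_HIDDEN):
                    W_dec[i][j] -= lr * err[i] * h[j]
            for j in range(N_HIDDEN):
                for k in range(N_INPUT):
                    W_enc[j][k] -= lr * grad_h[j] * x[k]


rng = random.Random(0)
data = make_data(20, rng)
W_enc = [[rng.uniform(-0.3, 0.3) for _ in range(N_INPUT)] for _ in range(N_HIDDEN)]
W_dec = [[rng.uniform(-0.3, 0.3) for _ in range(N_HIDDEN)] for _ in range(N_INPUT)]

initial_error = reconstruction_error(data, W_enc, W_dec)
train(data, W_enc, W_dec)
final_error = reconstruction_error(data, W_enc, W_dec)

code = matvec(W_enc, data[0])  # compressed description of the first record
print("error before training:", initial_error)
print("error after training: ", final_error)
print("hidden-node values for first record:", code)
```

After training, each record can be summarized by its 5 hidden-node values instead of its 10 raw inputs. Those hidden values play the role of the extracted features, and the reconstruction error indicates how much information was lost in the compression.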

Figure 10.3: Neural Networks can be Used for Data Compression and Feature Extraction

LOVELY PROFESSIONAL UNIVERSITY
