Page 172 - DCAP208_Management Support Systems

Unit 10: Data Mining Tools and Techniques




          Because of the origins of the techniques and because of some of their early successes, the techniques
          have enjoyed a great deal of interest. To explain how neural networks can detect patterns in
          a database, an analogy is often made: they “learn” to detect these patterns and make better
          predictions in much the same way that human beings do. This view is encouraged by the
          way the historical training data is often supplied to the network: one record (example) at a time.
          Neural networks do “learn” in a very real sense but under the hood the algorithms and techniques
          that are being deployed are not truly different from the techniques found in statistics or other
          data mining algorithms. It is, for instance, unfair to assume that neural networks could outperform
          other techniques because they “learn” and improve over time while the other techniques are
          static. The other techniques in fact “learn” from historical examples in exactly the same way, but
          often the examples (historical records) to learn from are processed all at once, in a more
          efficient manner than in neural networks, which often modify their model one record at a time.
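
          The contrast between the two styles of learning can be sketched in a few lines of Python.
          This is only an illustration under stated assumptions: the five records are hypothetical
          and follow y = 2x + 1 exactly, the “network” is a single linear unit, and the function
          names, learning rate and number of passes are choices made for this example, not taken
          from the text.

```python
# Hypothetical toy data: five historical records that follow y = 2x + 1 exactly.
records = [(0.0, 1.0), (0.25, 1.5), (0.5, 2.0), (0.75, 2.5), (1.0, 3.0)]


def fit_one_record_at_a_time(data, learning_rate=0.1, passes=500):
    """Neural-network style: nudge the model after every single record."""
    w, b = 0.0, 0.0
    for _ in range(passes):
        for x, y in data:
            error = (w * x + b) - y          # prediction error on this record
            w -= learning_rate * error * x   # small correction to the weight
            b -= learning_rate * error       # small correction to the offset
    return w, b


def fit_all_at_once(data):
    """Statistical style: ordinary least squares computed from all records."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    w = sxy / sxx
    return w, mean_y - w * mean_x


online_w, online_b = fit_one_record_at_a_time(records)
batch_w, batch_b = fit_all_at_once(records)
print("one record at a time:", round(online_w, 3), round(online_b, 3))
print("all at once:         ", round(batch_w, 3), round(batch_b, 3))
```

          On this toy data both routes should arrive at essentially the same line. The second one
          simply gets there in a single calculation instead of many small adjustments.
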
          A common claim for neural networks is that they are so automated that the user does
          not need to know much about how they work, about predictive modeling, or even about the database
          in order to use them. The implicit claim is also that most neural networks can be unleashed on
          your data straight out of the box, without having to rearrange or modify the data very much
          beforehand.

          Just the opposite is often true. There are many important design decisions that need to be made
          in order to use a neural network effectively, such as:

               How should the nodes in the network be connected?
               How many neuron-like processing units should be used?
               When should “training” be stopped in order to avoid overfitting?
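
          One common answer to the last question is to watch the error on records held back from
          training and to stop once that error no longer improves. The sketch below assumes this
          approach. The function name, the patience setting and the list of validation errors are
          hypothetical and serve only to illustrate the idea.

```python
def best_stopping_pass(validation_errors, patience=2):
    """Return the index of the training pass whose model should be kept.

    Training is considered finished once the held-out error has failed to
    improve for `patience` consecutive passes.
    """
    best_pass, best_error = 0, float("inf")
    for current_pass, error in enumerate(validation_errors):
        if error < best_error:
            best_pass, best_error = current_pass, error
        elif current_pass - best_pass >= patience:
            break  # no improvement for `patience` passes: stop training
    return best_pass


# Hypothetical held-out error after each pass over the training data.
errors = [0.40, 0.31, 0.25, 0.22, 0.23, 0.26, 0.21]
stop_at = best_stopping_pass(errors)
print("keep the model from pass", stop_at)
```

          A larger patience value tolerates more passes without improvement before stopping, at the
          cost of extra training time.
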
          There are also many important steps required for preprocessing the data that goes into a neural
          network. Most often there is a requirement to normalize numeric data to between 0.0 and 1.0, and
          categorical predictors may need to be broken up into virtual predictors that are 0 or 1 for each
          value of the original categorical predictor. And, as always, understanding what the data in your
          database means and having a clear definition of the business problem to be solved are essential to
          ensuring eventual success. The bottom line is that neural networks provide no shortcuts.
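
          The two preprocessing steps mentioned above can be sketched as follows. This is a minimal
          illustration, assuming simple min-max scaling for the numeric data. The function names and
          the sample values (ages and colors) are hypothetical.

```python
def min_max_scale(values):
    """Rescale numeric values so the smallest becomes 0.0 and the largest 1.0."""
    low, high = min(values), max(values)
    if high == low:                      # all values identical: avoid dividing by zero
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def to_virtual_predictors(values):
    """Turn one categorical predictor into one 0/1 predictor per distinct value."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


ages = [20, 35, 50]                          # hypothetical numeric predictor
colors = ["blue", "white", "black", "blue"]  # hypothetical categorical predictor

scaled_ages = min_max_scale(ages)
categories, encoded_colors = to_virtual_predictors(colors)

print(scaled_ages)      # [0.0, 0.5, 1.0]
print(categories)       # ['black', 'blue', 'white']
print(encoded_colors)   # [[0, 1, 0], [0, 0, 1], [1, 0, 0], [0, 1, 0]]
```
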

          Applying Neural Networks to Business

          Neural networks are very powerful predictive modeling techniques, but some of the power
          comes at the expense of ease of use and ease of deployment. As we will see in this section, neural
          networks create very complex models that are almost always impossible to fully understand,
          even by experts. The model itself is represented by numeric values in a complex calculation that
          requires all of the predictor values to be in the form of a number. The output of the neural
          network is also numeric and needs to be translated if the actual prediction value is categorical
          (e.g. predicting the demand for blue, white or black jeans for a clothing manufacturer requires
          that the values blue, black and white of the predictor color be converted to numbers).
          Because of the complexity of these techniques, much effort has been expended in trying to
          increase the clarity with which the model can be understood by the end user. These efforts are
          still in their infancy but are of tremendous importance, since most data mining techniques,
          including neural networks, are being deployed against real business problems where significant
          investments are made based on the predictions from the models (e.g. consider trusting the
          predictive model from a neural network that dictates which one million customers will receive
          a $1 mailing).
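
          Translating a numeric output back into a categorical prediction can be sketched as below.
          The sketch assumes a network with one numeric output per category and picks the category
          with the largest output. The output values and the function name are hypothetical.

```python
def decode_prediction(outputs, labels):
    """Return the label whose numeric output is largest."""
    best_index = max(range(len(outputs)), key=lambda i: outputs[i])
    return labels[best_index]


labels = ["blue", "white", "black"]   # categories of the jeans example
outputs = [0.12, 0.71, 0.17]          # hypothetical numeric outputs of a network

predicted_color = decode_prediction(outputs, labels)
print(predicted_color)                # white
```
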

          There are two ways that these shortcomings in understanding the meaning of the neural network
          model have been successfully addressed:




                                           LOVELY PROFESSIONAL UNIVERSITY                                   165