Page 176 - DCAP606_BUSINESS_INTELLIGENCE

Unit 12: Understanding Data Mining Tools




          Caution: Here, the CustomerID column is required in all models because it is the only
          available column that can be used as the case key.

          12.2.1 Defining a Mining Structure


          Setting up a data mining structure includes the following steps:
               Define a data source.
               Select the columns of data to include in the structure.
               Define a case key for the structure (including the key for the nested table, if applicable).
               Specify whether the source data should be separated into a training set and a testing set
               (optional).
               Process the structure.
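
          The optional training and testing split in the steps above can be illustrated outside the
          tool. The following is a minimal Python sketch, assuming cases are plain dictionaries
          identified by CustomerID. The function name split_cases, the 30 percent holdout, and the
          seed are hypothetical choices for illustration and are not part of any Analysis Services API.

```python
import random


def split_cases(cases, holdout_fraction=0.3, seed=42):
    """Randomly partition cases into (training, testing) lists."""
    if not 0.0 <= holdout_fraction <= 1.0:
        raise ValueError("holdout_fraction must be between 0 and 1")
    shuffled = list(cases)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = round(len(shuffled) * holdout_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical source data: ten cases, each identified by its case key.
cases = [{"CustomerID": i, "Income": 20000 + 1500 * i} for i in range(1, 11)]
training, testing = split_cases(cases)
```

          Each case lands in exactly one of the two sets, so a model trained on the training set can
          later be checked against cases it has not seen.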
          Data Sources for Mining Structures

          When you define a mining structure, you use columns that are available in an existing data
          source view. A data source view is a shared object that lets you combine multiple data
          sources and use them as a single source. The original data sources are not visible to client
          applications, and you can use the properties of the data source view to modify data types, create
          aggregations, or alias columns.

          If you build multiple mining models from the same mining structure, the models can use
          different columns from the structure.


                 Example: You can create a single structure and then build separate decision tree and
          clustering models from it, with each model using different columns and predicting different
          attributes.
          Also, each model can use the columns from the structure in different ways.


                 Example: Your data source view might contain an Income column, which you can bin
          in different ways for different models.
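
          Binning one column in different ways can be sketched in a few lines of Python. This is an
          illustration only, not how the tool discretizes data: the helper make_binner, the cut points,
          and the bucket labels are all assumptions chosen for the example.

```python
from bisect import bisect_right


def make_binner(edges, labels):
    """Return a function that maps a numeric value to a bucket label.

    edges are ascending cut points; a value equal to a cut point falls
    into the higher bucket. labels needs one more entry than edges.
    """
    if len(labels) != len(edges) + 1:
        raise ValueError("need exactly len(edges) + 1 labels")
    if list(edges) != sorted(edges):
        raise ValueError("edges must be in ascending order")

    def binner(value):
        return labels[bisect_right(edges, value)]

    return binner


# Two hypothetical models that bin the same Income column differently.
coarse = make_binner([50000], ["Low", "High"])
fine = make_binner([30000, 60000, 90000], ["Low", "Medium", "High", "Very High"])

incomes = [25000, 50000, 75000, 120000]
coarse_bins = [coarse(x) for x in incomes]
fine_bins = [fine(x) for x in incomes]
```

          The source values in incomes are never changed; each model only applies its own view of them.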
          Mining Structure Columns


          The building blocks of the mining structure are the mining structure columns, which describe the
          data that the data source contains. These columns contain information such as data type, content
          type, and how the data is distributed. The mining structure does not contain information about how
          columns are used for a specific mining model, or about the type of algorithm that is used to build
          a model; this information is defined in the mining model itself.
          A mining structure can also contain nested tables. A nested table represents a one-to-many
          relationship between the entity of a case and its related attributes.
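
          The division of responsibility described above can be sketched with a few Python data classes.
          This is a conceptual model only, not the Analysis Services object model: the class names, the
          role strings ("Key", "Input", "Predict"), the type strings, and the Region column are
          assumptions introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StructureColumn:
    """Metadata the structure keeps about one source column."""
    name: str
    data_type: str      # illustrative strings such as "Long" or "Text"
    content_type: str   # illustrative strings such as "Key" or "Continuous"


@dataclass
class MiningStructure:
    columns: dict = field(default_factory=dict)

    def add(self, column):
        self.columns[column.name] = column
        return self


@dataclass
class MiningModel:
    """Holds the algorithm and per-column usage; the structure holds neither."""
    name: str
    algorithm: str
    structure: MiningStructure
    usage: dict = field(default_factory=dict)  # column name -> role

    def use(self, column_name, role):
        if column_name not in self.structure.columns:
            raise KeyError(f"{column_name!r} is not a structure column")
        self.usage[column_name] = role
        return self


structure = MiningStructure()
structure.add(StructureColumn("CustomerID", "Long", "Key"))
structure.add(StructureColumn("Income", "Double", "Continuous"))
structure.add(StructureColumn("Region", "Text", "Discrete"))  # hypothetical column

# Two models share one structure but use its columns differently.
tree = MiningModel("IncomeTree", "decision tree", structure)
tree.use("CustomerID", "Key").use("Region", "Input").use("Income", "Predict")

clusters = MiningModel("CustomerClusters", "clustering", structure)
clusters.use("CustomerID", "Key").use("Income", "Input")
```

          Keeping usage on the model rather than on the column is what lets several models draw on the
          same structure without interfering with one another.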


                 Example: If the information that describes the customer resides in one table, and the
          customer’s purchases reside in another table, you can use nested tables to combine the information
          into a single case.
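
          The customer and purchase example can be made concrete with a short Python sketch. It assumes
          both tables are lists of dictionaries joined on CustomerID; the function name build_cases, the
          Product column, and the sample rows are hypothetical.

```python
from collections import defaultdict


def build_cases(customers, purchases, key="CustomerID"):
    """Attach each customer's purchase rows as a nested table."""
    nested = defaultdict(list)
    for row in purchases:
        # Drop the foreign key from the nested row; the parent case carries it.
        nested[row[key]].append({k: v for k, v in row.items() if k != key})
    # A customer with no purchases still forms a case, with an empty nested table.
    return [
        {**customer, "Purchases": list(nested.get(customer[key], []))}
        for customer in customers
    ]


customers = [
    {"CustomerID": 1, "Income": 42000},
    {"CustomerID": 2, "Income": 58000},
]
purchases = [
    {"CustomerID": 1, "Product": "Bike"},
    {"CustomerID": 1, "Product": "Helmet"},
    {"CustomerID": 2, "Product": "Gloves"},
]
cases = build_cases(customers, purchases)
```

          Each element of cases is one case: the customer row is the case table, and its Purchases list
          is the nested table holding the many side of the one-to-many relationship.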




                                           LOVELY PROFESSIONAL UNIVERSITY                                   171