Page 176 - DCAP606_BUSINESS_INTELLIGENCE
Unit 12: Understanding Data Mining Tools
Caution: Here, the CustomerID column is required in all models because it is the only
available column that can be used as the case key.
12.2.1 Defining a Mining Structure
Setting up a data mining structure includes the following steps:
Define a data source.
Select the columns of data to include in the structure and define a case key.
Define a key for the structure (including the key for the nested table, if applicable).
Specify whether the source data should be separated into a training set and a testing set
(optional).
Process the structure.
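The optional step of separating the source data into a training set and a testing set can be
sketched in Python as follows. This is only an illustration of the idea, not the mechanism the
tool itself uses; the function name split_cases, the 30 percent holdout fraction, and the seed
are assumptions chosen for the example.

```python
import random


def split_cases(case_keys, holdout_fraction=0.3, seed=42):
    """Split case keys into a training set and a testing set.

    holdout_fraction and seed are illustrative values (assumptions).
    """
    keys = list(case_keys)
    rng = random.Random(seed)      # fixed seed makes the split repeatable
    rng.shuffle(keys)
    n_test = int(len(keys) * holdout_fraction)
    testing = sorted(keys[:n_test])
    training = sorted(keys[n_test:])
    return training, testing


# Hypothetical case keys: ten CustomerID values.
customer_ids = range(1, 11)
training, testing = split_cases(customer_ids)
print("training:", training)
print("testing:", testing)
```

Every case lands in exactly one of the two sets, so a model built on the training set can be
checked against cases it has not seen.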
Data Sources for Mining Structures
When you define a mining structure, you use columns that are available in an existing data
source view. A data source view is a shared object that lets you combine multiple data
sources and use them as a single source. The original data sources are not visible to client
applications, and you can use the properties of the data source view to modify data types, create
aggregations, or alias columns.
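As a rough analogy only, the sketch below uses an ordinary SQL view in an in-memory SQLite
database to show the same ideas: two sources presented as one, a data type changed, and columns
given aliases. A data source view is a separate kind of object from a SQL view, and the table
names crm_customers and web_profiles and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Two hypothetical tables standing in for separate data sources.
    CREATE TABLE crm_customers (cust_id INTEGER PRIMARY KEY, yearly_income TEXT);
    CREATE TABLE web_profiles (cust_id INTEGER, region TEXT);
    INSERT INTO crm_customers VALUES (1, '52000'), (2, '87000');
    INSERT INTO web_profiles VALUES (1, 'North'), (2, 'South');

    -- The view combines both sources, aliases the columns,
    -- and converts the income text to an integer.
    CREATE VIEW customer_view AS
    SELECT c.cust_id AS CustomerID,
           CAST(c.yearly_income AS INTEGER) AS Income,
           w.region AS Region
    FROM crm_customers AS c
    JOIN web_profiles AS w ON w.cust_id = c.cust_id;
""")

rows = conn.execute(
    "SELECT CustomerID, Income, Region FROM customer_view ORDER BY CustomerID"
).fetchall()
print(rows)
```

A consumer that queries customer_view sees one source with the aliased column names and does
not need to know which underlying table each column came from.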
If you develop multiple mining models from the same mining structure, the models can use
different columns from the structure.
Example: You can create a single structure and then build separate decision tree and
clustering models from it, with each model using different columns and predicting different
attributes.
Also, each model can use the columns from the structure in different ways.
Example: Your data source view might contain an Income column, which you can bin
in different ways for different models.
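To make the Income example concrete, here is a minimal Python sketch that bins the same
column in two different ways, as two models might. The function names, the income values, and
the choice of equal-width and equal-count binning are assumptions made for illustration; they
are not the tool's own discretization methods.

```python
def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins ranges of equal width."""
    lo, hi = min(values), max(values)
    if hi == lo:                       # all values identical: one bin
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value would fall just past the last range, so clamp it.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def equal_count_bins(values, n_bins):
    """Assign values to n_bins groups holding roughly equal numbers of cases."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins


# Hypothetical Income values for six cases.
incomes = [20000, 25000, 30000, 40000, 60000, 120000]
width_bins = equal_width_bins(incomes, 3)
count_bins = equal_count_bins(incomes, 3)
print("equal width:", width_bins)
print("equal count:", count_bins)
```

With skewed data such as incomes, equal-width ranges put most cases into the lowest bin, while
equal-count groups spread the cases evenly. This is one reason different models may bin the
same column differently.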
Mining Structure Columns
The building blocks of the mining structure are the mining structure columns, which describe the
data that the data source contains. These columns contain information such as data type, content
type, and how the data is distributed. The mining structure does not contain information about how
columns are used for a specific mining model, or about the type of algorithm that is used to build
a model; this information is defined in the mining model itself.
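The division of responsibility between a structure and its models can be sketched with a few
Python data classes. This is a conceptual model only, not the tool's object model; the class
names, the column names other than CustomerID and Income, and the usage labels "key", "input",
and "predict" are assumptions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StructureColumn:
    """What the structure records about a column: types and distribution only."""
    name: str
    data_type: str
    content_type: str
    distribution: str = "unspecified"


@dataclass
class MiningStructure:
    name: str
    columns: dict = field(default_factory=dict)

    def add_column(self, column):
        self.columns[column.name] = column


@dataclass
class MiningModel:
    """The model, not the structure, records the algorithm and column usage."""
    name: str
    algorithm: str
    structure: MiningStructure
    usage: dict  # column name -> "key", "input", or "predict"

    def __post_init__(self):
        unknown = set(self.usage) - set(self.structure.columns)
        if unknown:
            raise ValueError(f"columns not in structure: {sorted(unknown)}")


structure = MiningStructure("Customers")
structure.add_column(StructureColumn("CustomerID", "Long", "Key"))
structure.add_column(StructureColumn("Income", "Double", "Continuous", "Normal"))
structure.add_column(StructureColumn("Age", "Long", "Continuous"))
structure.add_column(StructureColumn("BikeBuyer", "Boolean", "Discrete"))

# Two models on one structure, each using different columns.
tree = MiningModel("BuyerTree", "Decision Trees", structure,
                   {"CustomerID": "key", "Income": "input", "BikeBuyer": "predict"})
clusters = MiningModel("CustomerClusters", "Clustering", structure,
                       {"CustomerID": "key", "Income": "input", "Age": "input"})
print(sorted(tree.usage), sorted(clusters.usage))
```

Both models share the CustomerID case key and draw only on columns that the structure defines,
while the algorithm and the role of each column live on the model.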
A mining structure can also contain nested tables. A nested table represents a one-to-many
relationship between the entity of a case and its related attributes.
Example: If the information that describes the customer resides in one table, and the
customer’s purchases reside in another table, you can use nested tables to combine the information
into a single case.
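The customer and purchase example can be sketched in Python as follows. The sample rows, the
Age and Product columns, and the helper build_cases are hypothetical; the sketch only shows how
a one-to-many relationship folds into one case per customer.

```python
from collections import defaultdict

# Hypothetical case table: one row per customer.
customers = [
    {"CustomerID": 1, "Age": 34},
    {"CustomerID": 2, "Age": 51},
]
# Hypothetical nested table: many purchase rows per customer.
purchases = [
    {"CustomerID": 1, "Product": "Helmet"},
    {"CustomerID": 1, "Product": "Gloves"},
    {"CustomerID": 2, "Product": "Bike"},
]


def build_cases(case_rows, nested_rows, case_key, nested_name):
    """Attach each case's related nested rows under nested_name."""
    grouped = defaultdict(list)
    for row in nested_rows:
        # Drop the foreign key; the parent case already carries it.
        nested = {k: v for k, v in row.items() if k != case_key}
        grouped[row[case_key]].append(nested)
    return [
        {**row, nested_name: grouped.get(row[case_key], [])}
        for row in case_rows
    ]


cases = build_cases(customers, purchases, "CustomerID", "Purchases")
for case in cases:
    print(case)
```

Each resulting case holds the customer's own attributes plus a list of that customer's
purchases, keyed by CustomerID.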
LOVELY PROFESSIONAL UNIVERSITY 171