Page 156 - DCAP606_BUSINESS_INTELLIGENCE
Unit 11: Data Mining
consuming to resolve. In this unit, you will learn about data mining approaches, uses, and
related issues. Applications of data mining will also be discussed. As the unit progresses, you will
learn about data mining models: predictive, summary, network and association. Finally, the
basics of data mining algorithms will be introduced.
11.1 Data Mining
Data mining is the practice of automatically searching large stores of data to discover patterns
and trends that go beyond simple analysis. It uses sophisticated mathematical algorithms to
segment the data and evaluate the probability of future events. Some key terms to know before
going into further detail on data mining are defined below.
Data
Data are any facts, numbers, or text that can be processed by a computer. Today, organizations
are accumulating vast and growing amounts of data in different formats and different databases.
Did you know? This includes operational or transactional data (such as sales, cost, inventory,
payroll, and accounting), non-operational data (such as industry sales and forecast data)
and metadata, i.e. data about data.
Information
The patterns, associations, or relationships among all types of data can provide information.
Example: Analysis of retail point of sale transaction data can yield information on which
products are selling and when.
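The point-of-sale example can be illustrated with a minimal sketch. The records, field layout, and the function name units_by_product_and_month are hypothetical and chosen only for illustration; the idea is that raw transaction rows (data) are aggregated into units sold per product and month (information).

```python
from collections import Counter

# Hypothetical point-of-sale records: (product, month, units sold).
transactions = [
    ("bread", "2024-01", 3),
    ("milk", "2024-01", 2),
    ("bread", "2024-02", 5),
    ("milk", "2024-02", 1),
    ("bread", "2024-02", 2),
]


def units_by_product_and_month(records):
    """Aggregate raw sales rows into total units per (product, month)."""
    totals = Counter()
    for product, month, units in records:
        totals[(product, month)] += units
    return totals


totals = units_by_product_and_month(transactions)

# The (product, month) pair with the highest total answers
# "which product is selling, and when?" for this toy data set.
best_product, best_month = max(totals, key=totals.get)
print(best_product, best_month, totals[(best_product, best_month)])
```

A real analysis would read the rows from a transactional database rather than a hard-coded list, but the aggregation step would look much the same.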
Knowledge
Information can be converted into knowledge.
Example: Summary information on supermarket sales can be analysed in view of
promotional efforts to provide knowledge of consumer buying behaviour.
11.1.1 Process of Knowledge Discovery
The steps of the knowledge discovery process are outlined below:
1. Data cleaning: This refers to the removal of noise and inconsistent data.
2. Data integration: In this step, multiple data sources may be combined.
3. Data selection: In this step, data relevant to the analysis task are retrieved from the
database.
4. Data transformation: In this step, data are transformed or consolidated into forms
appropriate for mining by performing summary or aggregation operations.
5. Data mining: This is an essential process where intelligent methods are applied in order
to extract data patterns.
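Steps 1 to 5 above can be sketched as a small pipeline. This is only an illustrative sketch under stated assumptions: the two data sources, their fields, and all function names are hypothetical, and counting item pairs bought by the same customer stands in for the "intelligent methods" of step 5, which in practice would be a proper mining algorithm.

```python
from collections import Counter
from itertools import combinations

# Two hypothetical data sources; None marks a missing (noisy) value.
store_a = [
    {"customer": "c1", "item": "bread", "amount": 2.5},
    {"customer": "c2", "item": "milk", "amount": None},
]
store_b = [
    {"customer": "c1", "item": "milk", "amount": 1.2},
    {"customer": "c3", "item": "bread", "amount": 2.5},
    {"customer": "c3", "item": "milk", "amount": 1.2},
]


def clean(rows):
    """Step 1: drop rows that contain a missing value."""
    return [r for r in rows if all(v is not None for v in r.values())]


def integrate(*sources):
    """Step 2: combine several sources into one list of rows."""
    return [row for source in sources for row in source]


def select(rows, fields):
    """Step 3: keep only the fields relevant to the analysis task."""
    return [{f: r[f] for f in fields} for r in rows]


def transform(rows):
    """Step 4: aggregate rows into one set of items per customer."""
    baskets = {}
    for r in rows:
        baskets.setdefault(r["customer"], set()).add(r["item"])
    return baskets


def mine(baskets):
    """Step 5: count how often each pair of items shares a basket."""
    pairs = Counter()
    for items in baskets.values():
        pairs.update(combinations(sorted(items), 2))
    return pairs


rows = integrate(clean(store_a), clean(store_b))
baskets = transform(select(rows, ["customer", "item"]))
patterns = mine(baskets)
print(patterns.most_common(1))
```

Each function maps to one numbered step, so a step can be replaced (for example, a different cleaning rule) without touching the others.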
LOVELY PROFESSIONAL UNIVERSITY 151