Page 156 - DCAP606_BUSINESS_INTELLIGENCE

Unit 11: Data Mining




          consuming to resolve. In this unit, you will learn about data mining approaches, uses, and
          related issues. Applications of data mining will also be discussed. As the unit progresses, you will
          learn about data mining models: predictive, summary, network and association. Finally, the
          basics of data mining algorithms will be introduced.

          11.1 Data Mining

          Data mining is the practice of automatically searching large stores of data to discover patterns
          and trends that go beyond simple analysis. It uses sophisticated mathematical algorithms to
          segment the data and evaluate the probability of future events. A few key terms are worth
          knowing before going into further detail on data mining.

          Data

          Data are any facts, numbers, or text that can be processed by a computer. Today, organizations
          are accumulating vast and growing amounts of data in different formats and different databases.




             Did you know? Such data include operational or transactional data (such as sales, cost, inventory,
             payroll, and accounting), non-operational data (such as industry sales and forecast data),
             and metadata, i.e. data about data.

          Information

          The patterns, associations, or relationships among all types of data can provide information.


                 Example: Analysis of retail point of sale transaction data can yield information on which
          products are selling and when.
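
          The example above can be sketched in a few lines of Python. This is only an illustration:
          the transaction records, product names and function names below are hypothetical and do
          not come from any real point-of-sale system. The sketch totals the units sold per product
          (which products are selling) and per hour of the day (when they are selling).

```python
from collections import Counter
from datetime import datetime

# Hypothetical point-of-sale records: (timestamp, product, quantity sold).
transactions = [
    ("2024-03-01 09:15", "bread", 2),
    ("2024-03-01 09:40", "milk", 1),
    ("2024-03-01 17:05", "bread", 1),
    ("2024-03-01 17:30", "eggs", 3),
    ("2024-03-02 09:10", "bread", 4),
    ("2024-03-02 18:20", "milk", 2),
]


def units_by_product(rows):
    """Total units sold for each product (which products are selling)."""
    totals = Counter()
    for _, product, quantity in rows:
        totals[product] += quantity
    return totals


def units_by_hour(rows):
    """Total units sold in each hour of the day (when they are selling)."""
    totals = Counter()
    for timestamp, _, quantity in rows:
        hour = datetime.strptime(timestamp, "%Y-%m-%d %H:%M").hour
        totals[hour] += quantity
    return totals


product_totals = units_by_product(transactions)
hour_totals = units_by_hour(transactions)

print("Units per product:", dict(product_totals))
print("Units per hour:", dict(sorted(hour_totals.items())))
```

          The raw rows are data. The two summaries derived from them are information in the sense
          used above.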

          Knowledge

          Information can be converted into knowledge.


                 Example: Summary information on supermarket sales can be analysed in view of
          promotional efforts to provide knowledge of consumer buying behaviour.
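
          As a rough illustration of this step, the sketch below compares summary sales figures for
          weeks with and without a promotion. The weekly figures and the function name are invented
          for the example, and a simple difference of averages is assumed here as the analysis. A
          real study would also need to account for seasonality and other factors.

```python
from statistics import mean

# Hypothetical weekly summary: (week number, units sold, promotion ran that week).
weekly_sales = [
    (1, 120, False),
    (2, 180, True),
    (3, 110, False),
    (4, 200, True),
    (5, 130, False),
]


def average_units(rows, promoted):
    """Average weekly units sold over weeks whose promotion flag equals `promoted`."""
    return mean(units for _, units, promo in rows if promo is promoted)


promo_avg = average_units(weekly_sales, True)
baseline_avg = average_units(weekly_sales, False)
uplift = promo_avg - baseline_avg

print("Average units in promotion weeks:", promo_avg)
print("Average units in other weeks:", baseline_avg)
print("Difference:", uplift)
```

          A consistently positive difference would suggest that shoppers respond to promotions,
          which is knowledge about buying behaviour rather than a mere sales summary.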

          11.1.1 Process of Knowledge Discovery

          Let us have an overview of the steps one by one:
          1.   Data cleaning: This step removes noise and inconsistent data.

          2.   Data integration: In this step, multiple data sources may be combined.
          3.   Data selection: In this step, data relevant to the analysis task are retrieved from the
               database.

          4.   Data transformation: In this step, data are transformed or consolidated into forms
               appropriate for mining by performing summary or aggregation operations.
          5.   Data mining: This is an essential process where intelligent methods are applied in order
               to extract data patterns.
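
          The five steps listed so far can be sketched as a small Python pipeline. This is a minimal
          sketch under stated assumptions, not a definitive implementation: the two data sources,
          the field names and every function name are hypothetical, and counting item pairs that
          occur in the same basket is used only as a simple stand-in for the "intelligent methods"
          of the mining step.

```python
from collections import Counter
from itertools import combinations

# Hypothetical raw records from two separate sources (e.g. two stores).
store_a = [
    {"basket": 1, "item": "bread", "qty": 1},
    {"basket": 1, "item": "milk", "qty": 1},
    {"basket": 2, "item": "bread", "qty": None},   # missing quantity (noise)
    {"basket": 2, "item": "butter", "qty": 1},
]
store_b = [
    {"basket": 3, "item": "Bread ", "qty": 2},     # inconsistent spelling
    {"basket": 3, "item": "milk", "qty": 1},
    {"basket": 3, "item": "gift card", "qty": 1},
    {"basket": 4, "item": "milk", "qty": -5},      # invalid quantity (noise)
]


def clean(rows):
    """Step 1: drop rows with missing or non-positive quantities; normalise item names."""
    cleaned = []
    for row in rows:
        if row["qty"] is None or row["qty"] <= 0:
            continue
        cleaned.append({**row, "item": row["item"].strip().lower()})
    return cleaned


def integrate(*sources):
    """Step 2: combine several data sources into one list of records."""
    return [row for source in sources for row in source]


def select(rows, relevant_items):
    """Step 3: keep only the records relevant to the analysis task."""
    return [row for row in rows if row["item"] in relevant_items]


def transform(rows):
    """Step 4: aggregate the records into one set of items per basket."""
    baskets = {}
    for row in rows:
        baskets.setdefault(row["basket"], set()).add(row["item"])
    return baskets


def mine(baskets):
    """Step 5: count how often each pair of items appears in the same basket."""
    pair_counts = Counter()
    for items in baskets.values():
        for pair in combinations(sorted(items), 2):
            pair_counts[pair] += 1
    return pair_counts


records = integrate(clean(store_a), clean(store_b))
relevant = select(records, {"bread", "milk", "butter"})
baskets = transform(relevant)
pair_counts = mine(baskets)

print("Baskets:", baskets)
print("Co-occurring pairs:", dict(pair_counts))
```

          Each function corresponds to one numbered step, so the output of one step is the input to
          the next.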





                                           LOVELY PROFESSIONAL UNIVERSITY                                   151