Page 28 - DCAP603_DATAWARE_HOUSING_AND_DATAMINING

Data Warehousing and Data Mining




                                   knowledge can be applied to decision-making, process control, information management, and
                                   query processing. Therefore, data mining is considered one of the most important frontiers in
                                   database and information systems and one of the most promising interdisciplinary developments
                                   in information technology.
                                   2.2 What is Data Mining?


                                   Simply put, data mining refers to extracting or “mining” knowledge from large amounts of
                                   data. Other terms, such as knowledge mining from data, knowledge extraction, data/pattern
                                   analysis, data archaeology, and data dredging, are also used for data mining. Many people treat
                                   data mining as a synonym for another popularly used term, Knowledge Discovery from Data,
                                   or KDD.
                                   Others view data mining as simply an essential step in the process of knowledge discovery.
                                   Knowledge discovery is a process that consists of an iterative sequence of the following steps:

                                   1.   Data cleaning (to remove noise and inconsistent data)
                                   2.   Data integration (where multiple data sources may be combined)
                                   3.   Data selection (where data relevant to the analysis task are retrieved from the database)
                                   4.   Data transformation (where data are transformed or consolidated into forms appropriate
                                       for mining by performing summary or aggregation operations, for instance)
                                   5.   Data mining (an essential process where intelligent methods are applied in order to extract
                                       data patterns)
                                   6.   Pattern evaluation (to identify the truly interesting patterns representing knowledge
                                       based on some interestingness measures)
                                   7.   Knowledge presentation (where visualisation and knowledge representation techniques
                                       are used to present the mined knowledge to the user).
                                   The first four steps are different forms of data preprocessing, which prepare the data for
                                   mining. The data-mining step may then interact with the user or a knowledge base.
                                   The interesting patterns are presented to the user and may be stored as new knowledge in the
                                   knowledge base.
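
The seven steps can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy purchase records, the function names, the list of items of interest, and the support threshold of 0.5 are all hypothetical. Counting item pairs that occur together is only a simple stand-in for the "intelligent methods" of step 5, not a method prescribed by the text.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: two sources of (customer, item) purchase records.
SOURCE_A = [("c1", "bread"), ("c1", "milk"), ("c2", "bread"), ("c2", None)]
SOURCE_B = [("c2", "milk"), ("c3", "bread"), ("c3", "pen"),
            ("c1", "bread"), ("c3", "giftwrap")]


def clean(records):
    """Step 1: remove noisy records (here, records with a missing value)."""
    return [r for r in records if all(v is not None for v in r)]


def integrate(*sources):
    """Step 2: combine several sources, dropping exact duplicates."""
    merged = set()
    for source in sources:
        merged.update(source)
    return sorted(merged)


def select(records, items_of_interest):
    """Step 3: keep only the records relevant to the analysis task."""
    return [r for r in records if r[1] in items_of_interest]


def transform(records):
    """Step 4: aggregate records into one set of items per customer."""
    baskets = {}
    for customer, item in records:
        baskets.setdefault(customer, set()).add(item)
    return baskets


def mine(baskets):
    """Step 5: count how often each pair of items occurs together."""
    counts = Counter()
    for items in baskets.values():
        counts.update(combinations(sorted(items), 2))
    return counts


def evaluate(pair_counts, n_baskets, min_support):
    """Step 6: keep pairs whose support (an interestingness measure) is high enough."""
    return {pair: count / n_baskets
            for pair, count in pair_counts.items()
            if count / n_baskets >= min_support}


def present(patterns):
    """Step 7: render the surviving patterns in a readable form."""
    return [f"{a} & {b}: support {s:.2f}" for (a, b), s in sorted(patterns.items())]


def run_kdd(sources, items_of_interest, min_support):
    """Chain the seven steps in the order given in the list above."""
    cleaned = [clean(s) for s in sources]
    merged = integrate(*cleaned)
    relevant = select(merged, items_of_interest)
    baskets = transform(relevant)
    pair_counts = mine(baskets)
    patterns = evaluate(pair_counts, len(baskets), min_support)
    return present(patterns)


if __name__ == "__main__":
    print(run_kdd([SOURCE_A, SOURCE_B], {"bread", "milk", "pen"}, 0.5))
    # prints ['bread & milk: support 0.67']
```

In this sketch the first four functions correspond to the preprocessing steps, and only the pattern that passes the evaluation step reaches the user. A real system would typically iterate, for example by revisiting selection or the threshold after inspecting the presented patterns.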

                                   2.3 Definition of Data Mining


                                   Today, in industry, in the media, and in the database research milieu, the term data mining is
                                   becoming more popular than the longer term knowledge discovery from data. Therefore, taking a
                                   broader view of data mining functionality, data mining can be defined as “the process of discovering
                                   interesting knowledge from large amounts of data stored in databases, data warehouses, or other
                                   information repositories.”





















                                           Lovely Professional University