Page 29 - DCAP603_DATAWARE_HOUSING_AND_DATAMINING

Unit 2: Data Mining Concept




          For many years, statistics has been used to analyze data in an effort to find correlations, patterns,
          and dependencies. However, with advances in technology, more and more data are available,
          greatly exceeding the human capacity to analyze them manually. Before the 1990s, data
          collected by bankers, credit card companies, department stores and so on were little used. But
          in recent years, as computational power has increased, the idea of data mining has emerged. Data
          mining is a term used to describe the “process of discovering patterns and trends in large data
          sets in order to find useful decision-making information.” With data mining, the information
          obtained from the bankers, credit card companies, and department stores can be put to good
          use.
                                     Figure 2.1: Data Mining Chart

          2.4 Architecture of Data Mining

          Based on the above definition, the architecture of a typical data mining system may have the
          following major components (Figure 2.2):

          1.   Information repository: This is one or a set of databases, data warehouses, spreadsheets, or
               other kinds of information repositories. Data cleaning and data integration techniques may
               be performed on the data.

          2.   Database or data warehouse server: The database or data warehouse server is responsible
               for fetching the relevant data, based on the user’s data mining request.
          3.   Knowledge base: This is the domain knowledge that is used to guide the search or evaluate
               the interestingness of resulting patterns.
          4.   Data mining engine: This is essential to the data mining system and ideally consists of a set of
               functional modules for tasks such as characterisation, association and correlation analysis,
               classification, prediction, cluster analysis, outlier analysis, and evolution analysis.
          5.   Pattern evaluation module: This component typically employs interestingness measures
               and interacts with the data mining modules so as to focus the search toward interesting
               patterns.
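
          The five components listed above can be sketched as a small pipeline. This is only an
          illustrative sketch, not a definitive implementation: the transaction data, the function
          names, the region filter, and the 0.6 support threshold are all assumptions made up for the
          example. The mining step is a simple count of co-occurring item pairs, which stands in for
          the association analysis module.

```python
from collections import Counter
from itertools import combinations

# 1. Information repository: a toy in-memory "database" (made-up data).
REPOSITORY = [
    {"customer": "c1", "region": "north", "items": ["bread", "milk"]},
    {"customer": "c2", "region": "north", "items": ["bread", "butter", "milk"]},
    {"customer": "c3", "region": "south", "items": ["milk", "tea"]},
    {"customer": "c4", "region": "north", "items": ["bread", "milk", "tea"]},
]

# 3. Knowledge base: domain knowledge that guides the search.
#    The threshold value is a hypothetical choice for this example.
KNOWLEDGE_BASE = {"min_support": 0.6}


def fetch_relevant_data(repository, region):
    """2. Database server: fetch only the records relevant to the request."""
    return [set(row["items"]) for row in repository if row["region"] == region]


def mine_item_pairs(transactions):
    """4. Data mining engine: count how often each pair of items co-occurs."""
    counts = Counter()
    for items in transactions:
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    return counts


def evaluate_patterns(pair_counts, n_transactions, knowledge_base):
    """5. Pattern evaluation: keep pairs whose support meets the threshold."""
    min_support = knowledge_base["min_support"]
    return {
        pair: count / n_transactions
        for pair, count in pair_counts.items()
        if count / n_transactions >= min_support
    }


def run_request(region):
    """Handle one user request by passing it through components 2, 4 and 5."""
    transactions = fetch_relevant_data(REPOSITORY, region)
    pair_counts = mine_item_pairs(transactions)
    return evaluate_patterns(pair_counts, len(transactions), KNOWLEDGE_BASE)


print(run_request("north"))  # {('bread', 'milk'): 1.0}
```

          Here support is the fraction of fetched transactions that contain a given pair. Keeping the
          threshold in the knowledge base rather than in the mining function mirrors the separation
          between the data mining engine and the pattern evaluation module described above.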














                                           Lovely Professional University