Architectures of Data Mining Systems
Database and data warehouse systems have become the mainstream information
systems. Comprehensive information processing and data analysis infrastructures have
been systematically constructed around database systems and data warehouses.
A data mining (DM) system can be integrated with a database (DB) or data warehouse
(DW) system using one of the following coupling schemes:
• No coupling
• Loose coupling
• Semi-tight coupling
• Tight coupling
No coupling: No coupling means that a DM system does not utilize any function of a
DB or DW system. It may fetch data from a particular source (such as a file system),
process the data using some DM algorithms, and then store the mining results in another
file. It is simple to implement.
Disadvantages:
Without coupling, a DM system may spend a substantial amount of time finding, collecting,
cleaning, and transforming data, whereas in DB/DW systems data tend to be well
organized, indexed, cleaned, and integrated, so that finding task-relevant, high-quality
data becomes easy.
There are many tested, scalable algorithms and data structures implemented in DB/DW
systems. Without any coupling to such systems, a DM system will need to use other
tools, making it difficult to integrate such a system into an information processing
environment. Hence, no coupling represents a poor design.
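The no-coupling workflow described above can be sketched in a few lines of Python. This is
only an illustrative sketch, not something taken from the notes: the function name
mine_frequent_items, the CSV layout (one transaction per row), and the min_support
parameter are all assumptions. It shows that the DM program must read, clean, and count
the data itself, and that the results end up in another flat file.

```python
import csv
from collections import Counter


def mine_frequent_items(in_path, out_path, min_support=2):
    """Hypothetical no-coupling miner: flat file in, flat file out."""
    counts = Counter()
    with open(in_path, newline="") as src:
        for row in csv.reader(src):
            # Cleaning is the DM system's own job here: trim blanks, drop
            # empty cells, normalize case, count each item once per row.
            items = {cell.strip().lower() for cell in row if cell.strip()}
            counts.update(items)

    # Keep only the items that occur in at least min_support transactions.
    frequent = {item: n for item, n in counts.items() if n >= min_support}

    # The mining result is stored in another file, not in a DB/DW system.
    with open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for item, n in sorted(frequent.items()):
            writer.writerow([item, n])
    return frequent
```

Everything a DB/DW system would normally provide (selection, indexing, cleaning) has to be
re-implemented by hand in such a program, which is the cost the disadvantages above describe.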
Loose coupling:
A DM system uses some facilities of a DB/DW system. It fetches data from a data repository
managed by these systems, performs data mining, and then stores the results either in a
file or in a designated place in the DB/DW.
Advantages: It is better than no coupling, since it can fetch any portion of the data
stored in the DB/DW using query processing, indexing, and other system facilities.
It thereby gains some of the flexibility and efficiency provided by DB/DW systems.
Disadvantages: It is difficult to achieve high scalability and good performance with large
data sets, since loosely coupled DM systems are memory-based and do not explore the
data structures and query optimization methods provided by DB/DW systems.
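Loose coupling can be illustrated with Python's built-in sqlite3 module standing in for
the DB/DW system. The sketch below is an assumption-laden example: the sales table, the
frequent_items result table, and the function names are hypothetical. The database
selects the relevant rows (using its index), but the counting itself runs in the DM
program's memory, and the result is written back to a designated table.

```python
import sqlite3
from collections import Counter


def build_demo_db():
    """Create a small in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (store_id INTEGER, item TEXT)")
    conn.execute("CREATE INDEX idx_sales_store ON sales (store_id)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [(1, "milk"), (1, "bread"), (1, "milk"),
         (2, "eggs"), (2, "eggs"), (2, "bread")],
    )
    conn.commit()
    return conn


def mine_loosely_coupled(conn, store_id, min_support=2):
    # The DB/DW system does the selection (query processing, indexing) ...
    rows = conn.execute(
        "SELECT item FROM sales WHERE store_id = ?", (store_id,)
    ).fetchall()

    # ... but the mining runs entirely in the DM system's own memory.
    counts = Counter(item for (item,) in rows)
    frequent = sorted(
        (store_id, item, n) for item, n in counts.items() if n >= min_support
    )

    # The results go back to a designated place in the database.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS frequent_items "
        "(store_id INTEGER, item TEXT, support INTEGER)"
    )
    conn.executemany("INSERT INTO frequent_items VALUES (?, ?, ?)", frequent)
    conn.commit()
    return frequent
```

Because fetchall() pulls every selected row into memory before mining starts, this style
runs into the scalability limit mentioned above once the selected data no longer fits.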
Semi-tight coupling:
Besides the DB/DW facilities used under loose coupling, the DM system also uses efficient
implementations of a few DM primitives in the DB/DW system, such as sorting, indexing,
aggregation, histogram analysis, multiway join, and precomputation of some essential
statistical measures such as sum, count, etc. Also, some frequently used intermediate
mining results can be precomputed and stored in the DB/DW system.
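A minimal sketch of semi-tight coupling, again using sqlite3 as a stand-in DB/DW system:
the aggregation primitives (count and sum) run inside the database through GROUP BY, and
the intermediate result is precomputed and stored in a summary table that later mining
steps can reuse. The sales and item_stats tables and the function names are assumptions
made for illustration.

```python
import sqlite3


def build_demo_db():
    """Create a small in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (item TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("milk", 3), ("bread", 2), ("milk", 4), ("eggs", 5), ("bread", 1)],
    )
    conn.commit()
    return conn


def precompute_item_stats(conn):
    """Let the DB compute count and sum per item and store the result."""
    conn.execute("DROP TABLE IF EXISTS item_stats")
    conn.execute(
        "CREATE TABLE item_stats AS "
        "SELECT item, COUNT(*) AS support, SUM(amount) AS total "
        "FROM sales GROUP BY item"
    )
    conn.commit()


def frequent_items(conn, min_support=2):
    """Mining step that reuses the precomputed statistics."""
    return conn.execute(
        "SELECT item, support, total FROM item_stats "
        "WHERE support >= ? ORDER BY item",
        (min_support,),
    ).fetchall()
```

Compared with the loosely coupled style, the raw rows never leave the database here; only
the small summary is read by the DM program.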
Tight coupling:
Here, the DM system is smoothly integrated into the DB/DW system. The DM subsystem is
treated as one functional component of an information system. Data mining queries and
functions are optimized based on mining query analysis, data structures, indexing
schemes, and query processing methods of the DB/DW system.
Advantages: This approach is a highly desirable architecture. It facilitates efficient
implementation of data mining functions, provides high system performance, and
provides an integrated information processing environment.
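As a rough analogy only, the idea of a mining function living inside the database engine
can be sketched with a user-defined aggregate registered through sqlite3's
create_aggregate. The aggregate name most_frequent, the MostFrequent class, and the sales
table are hypothetical. A genuinely tightly coupled system would go further and optimize
mining queries inside the engine, which this small example does not attempt.

```python
import sqlite3
from collections import Counter


class MostFrequent:
    """User-defined aggregate: the most frequent value in each group."""

    def __init__(self):
        self.counts = Counter()

    def step(self, value):
        # Called by the database engine once per row of the group.
        if value is not None:
            self.counts[value] += 1

    def finalize(self):
        if not self.counts:
            return None
        # Break ties alphabetically so the result is deterministic.
        return min(self.counts, key=lambda k: (-self.counts[k], k))


def build_demo_db():
    """Create a small in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    # Register the mining function as part of the database itself.
    conn.create_aggregate("most_frequent", 1, MostFrequent)
    conn.execute("CREATE TABLE sales (store_id INTEGER, item TEXT)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [(1, "milk"), (1, "bread"), (1, "milk"),
         (2, "eggs"), (2, "eggs"), (2, "bread")],
    )
    conn.commit()
    return conn


def top_item_per_store(conn):
    # The mining function is invoked like any built-in SQL aggregate,
    # so grouping and scanning are handled by the query processor.
    return conn.execute(
        "SELECT store_id, most_frequent(item) FROM sales "
        "GROUP BY store_id ORDER BY store_id"
    ).fetchall()
```

From the caller's point of view the mining step is just another SQL query, which is the
sense in which the DM subsystem becomes one functional component of the information system.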