Data Mining and Data Warehousing
Data Warehousing - Cubing Algorithms

Imagination

While computing data cubes, we came across the concept of iceberg cubes, which materialize only the part of a data cube that satisfies a minimum threshold. An iceberg cube keeps only those cells whose support (the number of base tuples aggregated into the cell) is at least 'k', where 'k' is a threshold. All cells with support less than 'k' are pruned, which reduces the size of the data cube without losing much information. But is there another use for a threshold on support? Are there patterns in the data that have high support? If so, how can we find such patterns, and where are they used?

Insights

Cube computation is a memory-intensive operation. Algorithms for computing cubes should therefore be memory efficient and should reuse precomputed values to avoid recomputing redundant parts of the data cube. Cubing algorithms usually follow either a bottom-up or a top-down approach. Bottom-up cubing algorithms start from the base cuboid and aggregate over different attributes to generate higher-level cuboids, so computing one level requires only the cuboids of the previous level. Top-down cubing algorithms, on the other hand, start from the apex cuboid and use iceberg conditions to avoid constructing cells whose support is below the threshold, thereby avoiding useless computation. Some hybrid cubing algorithms combine both approaches for efficient cube computation. Note that "bottom" and "top" depend on how the cuboid lattice is drawn; the description above places the apex cuboid at the top and the base cuboid at the bottom.

Glossary

Iceberg cube: A cube consisting only of the cells that satisfy an iceberg condition, typically an antimonotonic one such as minimum support, which allows Apriori-style pruning.
Materialization: Precomputing and storing cube cells ahead of time so that later computation can reuse them.
BUC: A recursive bottom-up method for computing the ROLAP data cube. Its authors draw the lattice with the apex cuboid at the bottom, so BUC proceeds from the apex cuboid toward the base cuboid.
Multi array aggregation: A chunking-based method for computing the MOLAP data cube. Under the BUC authors' convention it is also referred to as top-down cubing, since it starts from the base cuboid.
Star cubing: An algorithm that integrates the advantages of both top-down and bottom-up cubing.

Resources

(For your convenience, you can get these inside the Learn More Quadrant.)
Iceberg cube and Definitions PPT
Cubing Heuristics PPT
Materialization and Cubing algorithms PPT
JIT lecture on Multi array cubing PPT

References

http://www.olapreport.com
http://www.olapreport.com/Market.htm
http://www.bvicam.ac.in/news/INDIACom%202011/292.pdf
http://www.vldb.org/conf/2003/papers/S15P02.pdf
http://slidewiki.org/deck/1564_star-cubing#tree-0-deck-1564-1-view
http://cs.uiuc.edu/class/fa05/cs412/chaps/4.pdf
http://www2.cs.uregina.ca/~dbd/cs831/notes/dcubes/iceberg.html
Bache, K. & Lichman, M. (2013). UCI Machine Learning Repository [1]. Irvine, CA: University of California, School of Information and Computer Science.
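
Worked example

To make the iceberg condition and the pruning idea from the Insights and Glossary sections concrete, here is a minimal sketch in the spirit of BUC. It is an illustration under stated assumptions, not the published algorithm in full. It assumes the fact table is a list of tuples, that the only measure is a count (support), and that `*` marks a rolled-up dimension. The function name `iceberg_cube`, its parameters, and the sample data are all hypothetical. It starts from the apex cell, partitions the data one dimension at a time, and stops expanding a partition as soon as its support drops below the threshold. This is safe because every more specific cell can only have equal or lower support.

```python
ALL = "*"  # marks a dimension that has been aggregated away


def iceberg_cube(rows, num_dims, min_support):
    """Return {cell: support} for every cell with support >= min_support.

    rows: list of tuples, each with num_dims dimension values
          (an assumed input format, chosen for illustration).
    """
    result = {}

    def expand(partition, cell, start_dim):
        # `partition` holds exactly the rows that match `cell`.
        result[tuple(cell)] = len(partition)
        for d in range(start_dim, num_dims):
            # Group the current partition by the value of dimension d.
            groups = {}
            for row in partition:
                groups.setdefault(row[d], []).append(row)
            for value, sub in groups.items():
                # Apriori-style pruning: if this cell fails the threshold,
                # every more specific cell below it fails as well.
                if len(sub) >= min_support:
                    cell[d] = value
                    expand(sub, cell, d + 1)
            cell[d] = ALL  # restore before moving to the next dimension

    # Start from the apex cell (*, *, ..., *) if it meets the threshold.
    if len(rows) >= min_support:
        expand(list(rows), [ALL] * num_dims, 0)
    return result


if __name__ == "__main__":
    sample = [("a1", "b1"), ("a1", "b1"), ("a1", "b2"), ("a2", "b1")]
    for cell, support in sorted(iceberg_cube(sample, 2, 2).items()):
        print(cell, support)
```

With the sample rows and a threshold of 2, the cells that survive are (*, *), (*, b1), (a1, *) and (a1, b1). The partitions for a2 and b2 each contain a single row, so they are pruned without being expanded further. A full implementation would also have to handle other measures, dimension ordering, and data that does not fit in memory, none of which this sketch attempts.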