UNIT I (2 Marks)

What is a data warehouse?
A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management's decision-making process.

What is the significant use of a subject-oriented data warehouse?
It focuses on the modeling and analysis of data for decision makers, not on daily operations or transaction processing, and provides a simple and concise view around particular subject issues by excluding data that are not useful in the decision support process.

Why do we use an integrated data warehouse?
It is constructed by integrating multiple, heterogeneous data sources (relational databases, flat files, on-line transaction records). Data cleaning and data integration techniques are applied to ensure consistency in naming conventions, encoding structures, attribute measures, etc. among the different data sources. When data is moved to the warehouse, it is converted.

What is the role of the time-variant feature in a data warehouse?
The time horizon for the data warehouse is significantly longer than that of operational systems: an operational database holds current-value data, while the data warehouse provides information from a historical perspective (e.g., the past several years). Every key structure in the data warehouse contains an element of time, explicitly or implicitly, whereas the key of operational data may or may not contain a time element.

What is meant by the nonvolatile nature of a data warehouse?
The warehouse is a physically separate store of data transformed from the operational environment. Operational update of data does not occur in the data warehouse environment.

State the difference between a data warehouse and an operational DBMS.
Traditional heterogeneous DB integration builds wrappers/mediators on top of heterogeneous databases; it is a query-driven approach in which a dictionary is used to translate a query into queries appropriate for the individual heterogeneous sites involved, and the results are integrated into a global answer set. A data warehouse, in contrast, is update-driven and high-performance: information from heterogeneous sources is integrated in advance and stored in the warehouse for direct query and analysis.

List the distinct features of OLTP vs. OLAP.
- User and system orientation: customer vs. market
- Data contents: current, detailed vs. historical, consolidated
- Database design: ER + application vs. star + subject
- View: current, local vs. evolutionary, integrated
- Access patterns: update vs. read-only but complex queries

Why do we need a separate data warehouse?
Different functions and different data:
- Missing data: decision support requires historical data, which operational DBs do not typically maintain.
- Data consolidation: decision support requires consolidation (aggregation, summarization) of data from heterogeneous sources.
- Data quality: different sources typically use inconsistent data representations, codes and formats, which have to be reconciled.

Give the conceptual modeling of a data warehouse.
Data warehouses are modeled in terms of dimensions and measures:
- Star schema: a fact table in the middle connected to a set of dimension tables.
- Snowflake schema: a refinement of the star schema in which some dimensional hierarchies are normalized into a set of smaller dimension tables, forming a shape similar to a snowflake.
- Fact constellation: multiple fact tables share dimension tables; viewed as a collection of stars, it is therefore also called a galaxy schema.

Define the distributive measure category of a data warehouse.
A measure is distributive if the result derived by applying the function to n aggregate values is the same as that derived by applying the function to all of the data without partitioning.
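As a minimal sketch of the measure categories above (toy data assumed, not from the text): sum() and count() are distributive over partitions, and avg() is algebraic because it can be rebuilt from those two distributive partials:

```python
# Toy illustration of distributive vs. algebraic measures (assumed data).
data = [4, 8, 15, 16, 23, 42]
partitions = [data[:2], data[2:4], data[4:]]

# Distributive: applying the function to partition aggregates equals
# applying it to the whole data set without partitioning.
total = sum(sum(p) for p in partitions)
assert total == sum(data)

count = sum(len(p) for p in partitions)
assert count == len(data)

# Algebraic: avg() is not itself distributive (averaging partition
# averages is wrong for unequal partitions), but it is computable from
# a bounded number of distributive partials: sum() and count().
avg = total / count
print(avg)  # prints 18.0
```

A holistic measure such as median() has no such bounded decomposition, which is why it is the costly case for cube aggregation.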
Define the algebraic measure category of a data warehouse.
A measure is algebraic if it can be computed by an algebraic function with M arguments (where M is a bounded integer), each of which is obtained by applying a distributive aggregate function.

Define the holistic measure category of a data warehouse.
A measure is holistic if there is no constant bound on the storage size needed to describe a subaggregate.

List the OLAP operations and their functionality.
- Roll up (drill-up): summarize data by climbing up a hierarchy or by dimension reduction.
- Drill down (roll down): the reverse of roll-up; move from a higher-level summary to a lower-level summary or detailed data, or introduce new dimensions.
- Slice and dice: project and select.
- Pivot (rotate): reorient the cube; visualization; 3D to a series of 2D planes.
- Other operations: drill across (involving more than one fact table) and drill through (through the bottom level of the cube to its back-end relational tables, using SQL).

What are the different views in the data warehouse design process?
There are four views regarding the design of a data warehouse:
- Top-down view: allows selection of the relevant information necessary for the data warehouse.
- Data source view: exposes the information being captured, stored, and managed by operational systems.
- Data warehouse view: consists of fact tables and dimension tables.
- Business query view: sees the perspectives of data in the warehouse from the view of the end user.

Define enterprise warehouse.
An enterprise warehouse collects all of the information about subjects spanning the entire organization.

What is meant by a data mart?
A data mart is a subset of corporate-wide data that is of value to a specific group of users. Its scope is confined to specific selected groups, such as a marketing data mart. Data marts may be independent or dependent (sourced directly from the warehouse).

Define virtual warehouse.
A virtual warehouse is a set of views over operational databases. Only some of the possible summary views may be materialized.

What are the back-end tools and utilities of a data warehouse?
- Data extraction: get data from multiple, heterogeneous, and external sources.
- Data cleaning: detect errors in the data and rectify them when possible.
- Data transformation: convert data from legacy or host format to warehouse format.
- Load: sort, summarize, consolidate, compute views, check integrity, and build indices and partitions.
- Refresh: propagate the updates from the data sources to the warehouse.

What are the applications of data warehousing?
Three kinds of data warehouse applications:
- Information processing: supports querying, basic statistical analysis, and reporting using crosstabs, tables, charts and graphs.
- Analytical processing: multidimensional analysis of data warehouse data; supports basic OLAP operations (slice-dice, drilling, pivoting).
- Data mining: knowledge discovery from hidden patterns; supports finding associations, constructing analytical models, performing classification and prediction, and presenting the mining results using visualization tools.

Why do we need on-line analytical mining?
- High quality of data in data warehouses: a data warehouse contains integrated, consistent, cleaned data.
- Available information processing structure surrounding data warehouses: ODBC, OLEDB, Web accessing, service facilities, reporting and OLAP tools.
- OLAP-based exploratory data analysis: mining with drilling, dicing, pivoting, etc.
- On-line selection of data mining functions: integration and swapping of multiple mining functions, algorithms, and tasks.

(16 Marks)
- Write short notes on data warehouse meta data.
- Explain the conceptual modeling of data warehouses.
- Give the architecture of a data warehouse and explain its usage.
- Explain the operations performed on a data warehouse with examples.
- State the difference between OLTP and OLAP in detail.

UNIT II (2 Marks)

What is meant by data cleaning?
Fill in missing values, smooth noisy data, identify or remove outliers, and resolve inconsistencies.

List the multidimensional measures of data quality.
(i) Accuracy (ii) Completeness (iii) Consistency (iv) Timeliness (v) Believability (vi) Value added (vii) Interpretability (viii) Accessibility

Why do we need data preprocessing?
Data in the real world is dirty:
(i) incomplete: lacking attribute values, lacking certain attributes of interest, or containing only aggregate data;
(ii) noisy: containing errors or outliers;
(iii) inconsistent: containing discrepancies in codes or names.

Define data integration.
Integration of multiple databases, data cubes, or files.

Why do we need data transformation?
- Normalization (especially for numerical data): min-max normalization, z-score normalization, normalization by decimal scaling.
- Attribute construction: new attributes constructed from the given ones.

What is meant by data discretization?
It can be defined as part of data reduction, but with particular importance, especially for numerical data.

What is the discretization process involved in data preprocessing?
It reduces the number of values for a given continuous attribute by dividing the range of the attribute into intervals. Interval labels can then be used to replace actual data values.

Define concept hierarchy.
It reduces the data by collecting and replacing low-level concepts (such as numeric values for the attribute age) with higher-level concepts (such as young, middle-aged, or senior).

Define data reduction.
Data reduction obtains a reduced representation of the data set that is much smaller in volume but produces the same (or similar) analytical results.

Why do we need data mining primitives and languages?
Mining without user guidance is unrealistic because the discovered patterns could be too many but uninteresting; the primitives let the user direct what is to be mined.

What tasks should be considered in the design of GUIs based on a data mining query language?
(i) Data collection and data mining query composition
(ii) Presentation of discovered patterns
(iii) Hierarchy specification and manipulation
(iv) Manipulation of data mining primitives
(v) Interactive multilevel mining
(vi) Other miscellaneous information
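The three normalization methods listed under data transformation can be sketched directly (toy attribute values assumed, not taken from the text):

```python
# Sketch of the three normalization methods (assumed toy data).
data = [200.0, 300.0, 400.0, 600.0, 1000.0]

# Min-max normalization: map [min, max] linearly onto [0, 1].
lo, hi = min(data), max(data)
minmax = [(v - lo) / (hi - lo) for v in data]

# Z-score normalization: (v - mean) / standard deviation.
mean = sum(data) / len(data)
std = (sum((v - mean) ** 2 for v in data) / len(data)) ** 0.5
zscore = [(v - mean) / std for v in data]

# Normalization by decimal scaling: divide by 10^j for the smallest j
# such that max(|v'|) < 1.
j = 0
while max(abs(v) for v in data) / (10 ** j) >= 1:
    j += 1
decimal = [v / 10 ** j for v in data]

print(minmax[0], decimal[-1])  # 0.0 0.1
```

After min-max normalization the smallest value maps to 0 and the largest to 1; after z-score normalization the transformed values have mean 0; decimal scaling here uses j = 4 since the maximum value is 1000.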
Define data mining query language (DMQL).
(i) A DMQL can provide the ability to support ad-hoc and interactive data mining.
(ii) By providing a standardized language like SQL, we (a) hope to achieve an effect similar to that which SQL has had on relational databases, (b) lay a foundation for system development and evolution, and (c) facilitate information exchange, technology transfer, commercialization, and wide acceptance.

What are the types of knowledge to be mined?
Descriptive vs. predictive data mining:
(i) Descriptive mining describes concepts or task-relevant data sets in concise, summarative, informative, discriminative forms.
(ii) Predictive mining, based on data and analysis, constructs models for the database and predicts the trend and properties of unknown data.

List the five primitives for specification of a data mining task.
- The set of task-relevant data to be mined
- The kind of knowledge to be mined
- The background knowledge to be used in the discovery process
- The interestingness measures and thresholds for pattern evaluation
- The representation and visualization techniques to be used for displaying the discovered patterns

What are the types of coupling of a data mining system with a DB/DW system?
- No coupling: flat file processing; not recommended.
- Loose coupling: fetching data from the DB/DW.
- Semi-tight coupling (enhanced DM performance): efficient implementation of a few data mining primitives in the DB/DW system, e.g., sorting, indexing, aggregation, histogram analysis, multiway join, and precomputation of some statistical functions.
- Tight coupling (a uniform information processing environment): DM is smoothly integrated into the DB/DW system, and a mining query is optimized based on mining query analysis, indexing, query processing methods, etc.

What is the strength of data characterization?
(i) An efficient implementation of data generalization.
(ii) Computation of various kinds of measures, e.g., count(), sum(), average(), max().
(iii) Generalization and specialization can be performed on a data cube by roll-up and drill-down.

Give the list of limitations of data characterization.
- It handles only dimensions of simple non-numeric data and measures of simple aggregated numeric values.
- Lack of intelligent analysis: it cannot tell which dimensions should be used and what level the generalization should reach.

Give the basic principle of attribute-oriented induction.
- Collect the task-relevant data (initial relation) using a relational database query.
- Perform generalization by attribute removal or attribute generalization.
- Apply aggregation by merging identical, generalized tuples and accumulating their respective counts.
- Interactive presentation with users.

How is attribute-oriented induction done?
- Attribute removal: remove attribute A if there is a large set of distinct values for A but there is no generalization operator on A, or A's higher-level concepts are expressed in terms of other attributes.
- Attribute generalization: if there is a large set of distinct values for A, and there exists a set of generalization operators on A, then select an operator and generalize A.
- Attribute threshold control and generalized relation threshold control (specified or default values).

What is the basic algorithm for attribute-oriented induction?
- InitialRel: query processing of the task-relevant data; the result is the initial relation.
- PreGen: based on the analysis of the number of distinct values in each attribute, determine a generalization plan for each attribute: removal, or how high to generalize.
- PrimeGen: based on the PreGen plan, perform generalization to the right level to derive a prime generalized relation, accumulating the counts.
- Presentation: user interaction; adjust levels by drilling, pivoting, mapping into rules, cross tabs, and visualization presentations.

State the difference between characterization and OLAP.
Similarity: data generalization; presentation of data summarization at multiple levels of abstraction; interactive drilling, pivoting, slicing and dicing.
Differences: automated allocation of the desired level; dimension relevance analysis and ranking when there are many relevant dimensions; sophisticated typing on dimensions and measures; analytical characterization: data dispersion analysis.

Define histogram analysis.
A frequency histogram is a univariate graphical method. It consists of a set of rectangles that reflect the counts or frequencies of the classes present in the given data.

What is meant by boxplot analysis?
- Data is represented with a box.
- The ends of the box are at the first and third quartiles, i.e., the height of the box is the IQR.
- The median is marked by a line within the box.
- Whiskers: two lines outside the box extend to the minimum and maximum; outliers may be plotted individually.

How can we measure the dispersion of data?
- Quartiles: Q1 (25th percentile) and Q3 (75th percentile); interquartile range IQR = Q3 - Q1.
- Five-number summary: min, Q1, median, Q3, max.
- Boxplot: the ends of the box are the quartiles, the median is marked, whiskers extend outside the box, and outliers are plotted individually. An outlier is usually a value more than 1.5 x IQR above or below the quartiles.
- Variance and standard deviation.

What is meant by a quantile plot?
It displays all of the data, allowing the user to assess both the overall behavior and unusual occurrences. It plots quantile information: for a data value xi (with the data sorted in increasing order), fi indicates that approximately 100*fi% of the data are below or equal to the value xi.

Define quantile-quantile (Q-Q) plot.
It graphs the quantiles of one univariate distribution against the corresponding quantiles of another, allowing the user to view whether there is a shift in going from one distribution to the other.

Define scatter plot.
Each pair of values is treated as a pair of coordinates and plotted as a point in the plane.

Give the definition for a loess curve.
A loess curve adds a smooth curve to a scatter plot to provide better perception of the pattern of dependence. It is fitted by setting two parameters: a smoothing parameter, and the degree of the polynomials that are fitted by the regression.

UNIT III (2 Marks)

State the rule measures for finding associations.
- Support: the probability that a transaction contains {X, Y, Z}.
- Confidence: the conditional probability that a transaction having {X, Y} also contains Z.
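A minimal sketch of the dispersion measures above (five-number summary, IQR, and the 1.5 x IQR outlier rule). The data and the simple linear-interpolation quartile convention are assumptions for illustration:

```python
# Five-number summary and 1.5*IQR outlier rule (toy data assumed).
# Quartiles use simple linear interpolation between sorted values.
def quantile(sorted_xs, f):
    """Value below or equal to which ~100*f% of the data lie."""
    pos = f * (len(sorted_xs) - 1)
    lo = int(pos)
    frac = pos - lo
    if lo + 1 < len(sorted_xs):
        return sorted_xs[lo] * (1 - frac) + sorted_xs[lo + 1] * frac
    return sorted_xs[lo]

data = sorted([7, 15, 36, 39, 40, 41, 42, 43, 47, 49, 95])
q1, median, q3 = (quantile(data, f) for f in (0.25, 0.5, 0.75))
iqr = q3 - q1
five_number = (data[0], q1, median, q3, data[-1])

# Boxplot whisker rule: points beyond 1.5*IQR from the quartiles
# are plotted individually as outliers.
outliers = [x for x in data if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr]
print(five_number, outliers)  # (7, 37.5, 41.0, 45.0, 95) [7, 15, 95]
```

Note that statistics packages offer several quartile interpolation conventions; the interpolated values (and hence the outlier fences) can differ slightly between them.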
(16 Marks)
- Why do we preprocess the data? Explain how data preprocessing techniques can improve the quality of the data.
- What is data cleaning? List and explain the various techniques used for data cleaning.
- Explain the major tasks in data preprocessing.
- List out and describe the primitives for specifying a data mining task.
- How is attribute-oriented induction implemented? Explain with an example.
- Describe how concept hierarchies and data generalization are useful in data mining.

What are the applications of association rule mining?
Basket data analysis, cross-marketing, catalog design, loss-leader analysis, clustering, classification, etc.

Why is counting the supports of candidates a problem?
- The total number of candidates can be very huge.
- One transaction may contain many candidates.

Give the method to find the supports of candidates.
- Candidate itemsets are stored in a hash-tree.
- A leaf node of the hash-tree contains a list of itemsets and counts.
- An interior node contains a hash table.
- The subset function finds all the candidates contained in a transaction.

List the methods to improve Apriori's efficiency.
- Hash-based itemset counting: a k-itemset whose corresponding hashing bucket count is below the threshold cannot be frequent.
- Transaction reduction: a transaction that does not contain any frequent k-itemset is useless in subsequent scans.
- Partitioning: any itemset that is potentially frequent in DB must be frequent in at least one of the partitions of DB.
- Sampling: mining on a subset of the given data with a lower support threshold, plus a method to determine the completeness.
- Dynamic itemset counting: add new candidate itemsets only when all of their subsets are estimated to be frequent.

What are the different ways to find associations?
- Boolean vs. quantitative associations (based on the types of values handled), e.g.,
  buys(x, "SQLServer") ^ buys(x, "DMBook") => buys(x, "DBMiner")
  age(x, ...) ^ income(x, ...) => buys(x, "PC")
- Single-dimension vs. multiple-dimensional associations (see the examples above).
- Single-level vs. multiple-level analysis (e.g., what brands of beers are associated with what brands of diapers?).

What is the advantage of the FP-tree structure?
Completeness:
- It never breaks a long pattern of any transaction.
- It preserves complete information for frequent pattern mining.
Compactness:
- It reduces irrelevant information: infrequent items are gone.
- Frequency-descending ordering: more frequent items are more likely to be shared.
- It is never larger than the original database (if node-links and counts are not counted).

Give the method for mining frequent patterns using the FP-tree structure.
- For each item, construct its conditional pattern-base, and then its conditional FP-tree.
- Repeat the process on each newly created conditional FP-tree.
- Stop when the resulting FP-tree is empty, or when it contains only one path (a single path will generate all the combinations of its subpaths, each of which is a frequent pattern).

List the major steps to mine an FP-tree.
- Construct the conditional pattern base for each node in the FP-tree.
- Construct the conditional FP-tree from each conditional pattern-base.
- Recursively mine the conditional FP-trees and grow the frequent patterns obtained so far.
- If the conditional FP-tree contains a single path, simply enumerate all the patterns.

What is meant by the node-link property?
For any frequent item ai, all the possible frequent patterns that contain ai can be obtained by following ai's node-links, starting from ai's head in the FP-tree header.

Define the prefix path property.
To calculate the frequent patterns for a node ai in a path P, only the prefix subpath of ai in P needs to be accumulated, and its frequency count should carry the same count as node ai.

What is the principle of frequent pattern growth?
Pattern growth property: let a be a frequent itemset in DB, B be a's conditional pattern base, and b be an itemset in B. Then a u b is a frequent itemset in DB iff b is frequent in B.

Why is frequent pattern growth fast?
- FP-growth is an order of magnitude faster than Apriori, and is also faster than tree-projection.
- No candidate generation, no candidate test.
- It uses a compact data structure.
- It eliminates repeated database scans.
- Its basic operations are counting and FP-tree building.

Define multiple-level association rules.
Items often form a hierarchy, and items at a lower level are expected to have lower support. Rules regarding itemsets at appropriate levels can be quite useful. The transaction database can be encoded based on dimensions and levels, and we can explore shared multilevel mining.

What is meant by uniform support?
The same minimum support is used for all levels, i.e., one minimum support threshold; there is no need to examine itemsets containing any item whose ancestors do not have minimum support. Drawback: lower-level items do not occur as frequently, so if the support threshold is too high we miss low-level associations, and if it is too low we generate too many high-level associations.

What do you mean by reduced support?
Reduced minimum support at lower levels. There are 4 search strategies:
(a) level-by-level independent;
(b) level-cross filtering by k-itemset;
(c) level-cross filtering by single item;
(d) controlled level-cross filtering by single item.

What is meant by an iceberg query?
It computes aggregates over one attribute or a set of attributes only for those groups whose aggregate values are above a certain threshold.

Define two-step (or multi-step) mining.
First apply a rough/cheap operator (superset coverage); then apply an expensive algorithm on a substantially reduced candidate set.

What is the functionality of superset coverage?
It preserves all of the positive answers: it allows a false positive test but not a false negative test.

Why is progressive refinement suitable?
The mining operator can be expensive or cheap, fine or rough; progressive refinement trades speed against quality through step-by-step refinement.

What are the limitations of ARCS?
- Only quantitative attributes can appear on the LHS of rules.
- Only 2 attributes on the LHS (a 2-D limitation).
- An alternative to ARCS: non-grid-based, equi-depth binning, clustering based on a measure of partial completeness.

State distance-based association rules.
This is a dynamic discretization process that considers the distance between data points.

Define quantitative association rules.
Quantitative attributes are dynamically discretized into bins based on the distribution of the data.

What are categorical and quantitative attributes?
- Categorical attributes: a finite number of possible values, with no ordering among the values.
- Quantitative attributes: numeric, with an implicit ordering among the values.

Give the two-step mining of spatial associations.
- Step 1: rough spatial computation as a filter, using MBRs or an R-tree for rough estimation.
- Step 2: a detailed spatial algorithm as refinement, applied only to those objects which have passed the rough spatial association test (no less than min_support).

(16 Marks)
- (i) Explain the methods to improve Apriori's efficiency. (ii) Describe the join and prune steps in the Apriori algorithm.
- Construct the FP-tree for a given transaction DB.
- Discuss the approaches for mining multi-dimensional association rules from transactional databases. Give suitable examples.
- Explain how the mining of frequent itemsets is done, with an example.
- Discuss the following in detail: association mining, support, confidence, rule measures.
- A database has four transactions. Let min_sup = 60% and min_conf = 80%.

  TID   Items bought
  T100  K, A, D, B
  T200  D, A, C, E, B
  T300  C, A, B, E
  T400  B, A, D

  (i) Find all frequent itemsets using Apriori and FP-growth respectively.
  (ii) List all strong association rules (with support s and confidence c) matching the following metarule, where X is a variable representing customers and item_i denotes variables representing items:
  for all x in transactions: buys(X, item1) ^ buys(X, item2) => buys(X, item3) [s, c]
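As a sketch of the support and confidence measures, applied to a small transaction set in the style of the four-transaction exercise (the data and the example rule are assumptions for illustration):

```python
# Support and confidence for association rules over toy transactions.
transactions = [
    {"K", "A", "D", "B"},
    {"D", "A", "C", "E", "B"},
    {"C", "A", "B", "E"},
    {"B", "A", "D"},
]

def support(itemset):
    """P(transaction contains the itemset)."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(lhs, rhs):
    """P(transaction contains rhs | transaction contains lhs)."""
    return support(lhs | rhs) / support(lhs)

# Rule {B, D} => {A}: support of {A, B, D}, then rule confidence.
print(support({"A", "B", "D"}))       # 0.75
print(confidence({"B", "D"}, {"A"}))  # 1.0
```

With min_sup = 60% and min_conf = 80%, this rule would qualify as strong: its support (75%) and confidence (100%) both exceed the thresholds.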
UNIT IV (2 Marks)

What are classification and prediction?
- Classification predicts categorical class labels: it classifies data by constructing a model based on the training set and the values (class labels) of a classifying attribute, and uses the model to classify new data.
- Prediction models continuous-valued functions, i.e., it predicts unknown or missing values.

List the typical applications of classification and prediction.
Credit approval, target marketing, medical diagnosis, treatment effectiveness analysis.

Define supervised learning (classification).
- Supervision: the training data (observations, measurements, etc.) are accompanied by labels indicating the class of the observations.
- New data is classified based on the training set.

What is meant by unsupervised learning (clustering)?
- The class labels of the training data are unknown.
- Given a set of measurements, observations, etc., the aim is to establish the existence of classes or clusters in the data.

What is the process involved in data preparation?
- Preprocess the data in order to reduce noise and handle missing values.
- Remove irrelevant or redundant attributes.
- Generalize and/or normalize the data.

How can we evaluate classification methods?
- Predictive accuracy.
- Speed and scalability: time to construct the model; time to use the model; efficiency in disk-resident databases.
- Robustness: handling noise and missing values.
- Interpretability: understanding and insight provided by the model.
- Goodness of rules: decision tree size; compactness of classification rules.

What is meant by a decision tree?
A flow-chart-like tree structure in which internal nodes denote tests on attributes, branches represent the outcomes of the tests, and leaf nodes represent class labels or class distributions. It is convertible to simple and easy-to-understand classification rules.

What are the phases involved in decision tree induction?
(a) Tree construction: partition the examples recursively based on selected attributes.
(b) Tree pruning: identify and remove branches that reflect noise or outliers.

State the functionality of the greedy algorithm (basic decision tree induction).
The tree is constructed in a top-down recursive divide-and-conquer manner. All attributes are assumed to be categorical; continuous-valued attributes are discretized in advance. Examples are partitioned recursively based on selected attributes, chosen using a heuristic or statistical measure.

What is the condition to stop the partitioning?
(a) All samples for a given node belong to the same class.
(b) There are no remaining attributes for further partitioning (majority voting is employed for classifying the leaf).
(c) There are no samples left.

Define Gini index.
If a data set T contains examples from n classes, gini(T) = 1 - sum_j (p_j)^2, where p_j is the relative frequency of class j in T. The measure can be modified for continuous-valued attributes to get the possible split values.

State the two approaches to avoid overfitting.
(a) Prepruning: halt tree construction early; do not split a node if this would result in the goodness measure falling below a threshold.
(b) Postpruning: remove branches from a "fully grown" tree to get a sequence of progressively pruned trees, then decide which is the best pruned tree.

Why decision tree induction in data mining?
(a) Relatively faster learning speed than other classification methods.
(b) Convertible to simple and easy-to-understand classification rules.
(c) Can use SQL queries for accessing databases.
(d) Comparable classification accuracy with other methods.

What is meant by information gain?
Information gain is the attribute selection measure: the attribute with the highest information gain (the greatest expected reduction in the information needed to classify the samples) is chosen as the test attribute at each node.

Why do we need Bayesian classification?
- Probabilistic learning: calculate explicit probabilities for a hypothesis; among the most practical approaches to certain types of learning problems.
- Incremental: each training example can incrementally increase/decrease the probability that a hypothesis is correct; prior knowledge can be combined with observed data.
- Probabilistic prediction: predict multiple hypotheses, weighted by their probabilities.
- Standard: even when Bayesian methods are computationally intractable, they can provide a standard of optimal decision making against which other methods can be measured.
Given training data, the posteriori probability of a hypothesis follows Bayes theorem: P(C|X) = P(X|C)P(C) / P(X); classification chooses the class C for which P(X|C)P(C) is maximum.

What is meant by the k-nearest neighbor algorithm?
- All instances correspond to points in n-D space.
- The nearest neighbors are defined in terms of Euclidean distance.
- The target function may be discrete- or real-valued.
- For discrete-valued functions, k-NN returns the most common value among the k training examples nearest to the query point xq.

Define the case-based reasoning approach.
- Instances are represented by rich symbolic descriptions (e.g., function graphs).
- Multiple retrieved cases may be combined.
- There is tight coupling between case retrieval, knowledge-based reasoning, and problem solving.

State the functionality of the rough set approach.
Rough sets are used to approximately (or "roughly") define equivalence classes. A rough set for a given class C is approximated by a lower approximation (certain to be in C) and an upper approximation (cannot be described as not belonging to C). Finding the minimal subsets (reducts) of attributes for feature reduction is NP-hard, but a discernibility matrix can be used to reduce the computation intensity.

What is the role of genetic algorithms?
- GA is based on an analogy to biological evolution.
- An initial population is created consisting of randomly generated rules, each represented by a string of bits (e.g., IF A1 AND NOT A2 THEN C2 can be encoded as 100).
- Based on the notion of survival of the fittest, a new population is formed consisting of the fittest rules and their offspring; the fitness of a rule is represented by its classification accuracy on a set of training examples.
- Offspring are generated by crossover and mutation.

Define prediction in relation to classification.
(i) Prediction is similar to classification: first construct a model, then use the model to predict unknown values; the major method for prediction is regression (linear and multiple regression, non-linear regression).
(ii) Prediction is different from classification: classification refers to predicting categorical class labels, whereas prediction models continuous-valued functions.

How can we estimate error rates?
- Partition (training-and-testing): use two independent data sets, e.g., a training set (2/3) and a test set (1/3); used for data sets with a large number of samples.
- Cross-validation: divide the data set into k subsamples and use k-1 subsamples as training data and one subsample as test data (k-fold cross-validation); for data sets of moderate size.
- Bootstrapping (leave-one-out): for small-size data.

What are the types of prediction?
- Linear regression: Y = a + bX; the two parameters a and b specify the line and are to be estimated from the data at hand, using the least squares criterion applied to the known values of Y and X.
- Multiple regression: Y = b0 + b1*X1 + b2*X2; many nonlinear functions can be transformed into this form.
- Log-linear models: the multi-way table of joint probabilities is approximated by a product of lower-order tables, e.g., p(a, b, c, d) = alpha_ab * beta_ac * chi_ad * delta_bcd.
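The least-squares fit for Y = a + bX described above can be sketched directly (toy points assumed; they happen to lie exactly on the line Y = 1 + 2X):

```python
# Least-squares estimates for simple linear regression Y = a + b*X
# (toy data assumed).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# b = sum((x - mean_x)(y - mean_y)) / sum((x - mean_x)^2)
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
    / sum((x - mean_x) ** 2 for x in xs)
a = mean_y - b * mean_x  # the fitted line passes through the means

print(a, b)  # 1.0 2.0
```

Prediction of an unknown value then reduces to evaluating the fitted line, e.g. a + b * 6.0 for X = 6.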
use model to predict unknown value i. Prediction models continuousvalued functions .b. a and b specify the line and are to be estimated by using the data at hand. Crossvalidation . e. training set /. b. Probability pa. What is meant by Boosting .The multiway table of joint probabilities is approximated by a product of lowerorder tables. i. detect spatial clusters and explain them in spatial data mining iii. Spatial Data Analysis a. Boosting increases classification accuracy Applicable to decision trees or Bayesian classifier . State the role of cluster Analysis. Similar to one another within the same cluster d. Give the applications of clustering. Image Processing iv. where each classifier in the series pays more attention to the examples misclassified by its predecessor . Boosting requires only linear time and constant space . create thematic maps in GIS by clustering feature spaces b. d aab baccad dbcd . Pattern Recognition ii. Economic Science especially market research v. Dissimilar to the objects in other clusters Cluster analysis e. Grouping a set of data objects into clusters Clustering is unsupervised classification no predefined classes . c. WWW . Cluster a collection of data objects c. Learn a series of classifiers. What is the requirement for clustering in data mining Scalability Ability to deal with different types of attributes Discovery of clusters with arbitrary shape Minimal requirements for domain knowledge to determine input parameters Able to deal with noise and outliers Insensitive to order of input records High dimensionality Incorporation of userspecified constraints Interpretability and usability . Hierarchy algorithms Create a hierarchical decomposition of the set of data or objects using some criterion . What are the algorithms used for clustering . What are outliers ssimilar from the remainder of the data . Partitioning algorithms Construct various partitions and then evaluate them by some criterion . 
Cluster Web log data to discover groups of similar access patterns . Document classification g. Gridbased based on a multiplelevel granularity structure . Densitybased based on connectivity and density functions . Modelbased A model is hypothesized for each of the clusters and the idea is to find the best fit of that model to each other .f. Describe the working of PAM Partioning Around Medoids algorithm. . Unit . Marks . Which attribute is said to be set valued attribute level concepts e set. Plan mining extraction of important or significant generalized sequential patterns from a planbase a large collection of plans . Briefly outline the major steps of decision tree classification. or the weighted average for numerical data videogames . Discuss Bayesian classification with its theorem What is prediction Explain about various prediction techniques. How we classified Sequence value Attribute valued attributes except that the order of the elements in the sequence should be observed in the generalization . Explain the measure of attributes in decision tree induction and outline the major steps involved in it . such as the number of elements in the set. Discuss the different types of clustering methods. Define Plan mining. the types or value ranges in the set. . . . . . and time of creation intensive if performed manually . What is meant by Spatial trend analysis ial dimension distance from an ocean . size. Online aggregation collect and store pointers to spatial objects in a spatial data cube o expensive and slow. Define Descriptionbased retrieval systems and perform object retrieval based on image descriptions. Give the methods for computing spatial data cube. need efficient aggregation techniques Precompute and store all the possible combinations o huge space overhead Precompute and store rough approximations in a spatial data cube o accuracy tradeoff . such as keywords. 
. What are the steps needed to make spatial association Two-step mining of spatial association: first apply a rough spatial computation (for instance with approximate structures such as an R-tree) as a filter for rough estimation; then apply a detailed spatial computation only to the candidates that pass the rough test with no less than min-support.
. What is the requirement for maintaining Time-series database A time-series database consists of sequences of values changing with time; its series components include trend, cyclic, seasonal, and irregular movements.
. State the process involved in Content-based retrieval systems Content-based retrieval systems support retrieval based on the image content itself, using features such as color histograms, visual characteristics such as edge or layout vectors, and wavelet transforms.
Marks
. Discuss some of the applications using data mining system.
. Explain mining WWW process.
. Explain the concept involved in multimedia database.
. Describe how multidimensional analysis is performed in data mining system.
. Give some examples for text-based database and explain how it is implemented using data mining system.
. List the descriptors present in multidimensional analysis.
. Describe the trends that cover data mining system in detail.
. How is spatial database helpful in data mining system.
. Explain the ways in which descriptive mining of complex data objects is identified with an example.
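The two-step (filter-and-refine) idea behind spatial association mining can be sketched as follows: a cheap rough test prunes most object pairs, and the exact geometric test runs only on the surviving candidates. Here a simple axis-aligned pre-check stands in for rougher approximations such as minimum bounding rectangles, and the point data are illustrative:

```python
from math import hypot

def close_pairs(points, max_dist):
    # Step 1 (rough filter): a pair can only be close if it is close
    # along each axis independently -- a cheap test that prunes most pairs.
    candidates = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            (x1, y1), (x2, y2) = points[i], points[j]
            if abs(x1 - x2) <= max_dist and abs(y1 - y2) <= max_dist:
                candidates.append((i, j))
    # Step 2 (refinement): the exact Euclidean test, applied only to
    # the candidates that survived the rough filter.
    return [(i, j) for i, j in candidates
            if hypot(points[i][0] - points[j][0],
                     points[i][1] - points[j][1]) <= max_dist]
```

In a full spatial association miner the refinement step would additionally drop candidate patterns whose support falls below min-support; this sketch shows only the filter-and-refine structure.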