Perception-Based Classification (PBC) System
Salvador Ledezma
[email protected]
April 25, 2002

Outline
- Introduction
- Concepts
- Demo of PBC

References
- "Towards an Effective Cooperation of the User and the Computer for Classification"
- "Visual Data Mining with Pixel-oriented Visualization Techniques"
- "Visual Classification: An Interactive Approach to Decision Tree Construction"
- Mihael Ankerst is author or coauthor of each

Data Mining
- Exploration and analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful patterns and rules
- Part of the Knowledge Discovery in Databases (KDD) process

Classification
- Major task of data mining
- Assign an object to one of a set of given classes based on the object's attributes

Classification Algorithms
- Decision tree classifier
  - Training set: a set of objects whose attributes and classes are already known
  - Using the training set, the tree classifier determines a classification function represented by a decision tree
  - The tree is a model for the class attribute as a function of the values of the other attributes
  - Test set: validates the classification function

Classification Example

Classification (cont)
- Usually, algorithms are black boxes with no user interaction or intervention
- Reasons for user involvement in decision tree construction:
  - Use human pattern recognition capabilities
  - User will have a better understanding of the tree
  - User provides domain knowledge

Visual Data Mining
- Tackle data mining tasks by enabling human involvement
- Incorporate the perceptual abilities of humans

Visual Classification
- Construction of decision trees is decomposed into substeps
- Enables human involvement
- Example: PBC
- Data visualization is based on 2 concepts:
  - Each attribute of the training data is visualized in a separate part of the screen
  - Different class labels of training objects are represented by different colors

Pixel-Oriented Visualization Techniques
- Represent each attribute value as a single colored pixel
- Map the range of possible attribute values to a fixed color map
- Maximizes the amount of information represented at one time without any overlap

Circle Segments Technique
- Data is shown as a circle divided into segments
- Each segment represents an attribute
- Each attribute value is mapped to a single colored pixel; the arrangement starts in the center and proceeds outward

Example
- Light = high stock price
- Dark = low stock price
- Represents 50 stocks
- 1 circle represents the prices of different stocks at the same time

Bar Visualization
- For each attribute, attribute values are sorted into attribute lists
- Classes are represented by colors
- Within a bar, sorted attribute values are mapped to pixels, line by line
- Each attribute is placed in a different bar

DNA Training Data
- Attribute 85 and attribute 90 are visually good candidates for splitting the tree
- The algorithm picks attribute 90 as the optimal split

PBC
- Uses pixel-oriented visualization
- Visualizes training data in order to support interactive decision tree construction

Examples of use
- Automatic
- Automatic-manual (top 2 levels)
- Manual-automatic
- Manual
- Actual use lies somewhere along this spectrum

Additional Functionality
- Propose split
- Look-ahead: for a hypothetical split
- Expand tree: automatic expanding and construction

PBC demo
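Sketch: bar visualization

The bar visualization steps described above (sort one attribute's values, keep each object's class label, fill the bar line by line) can be illustrated with a minimal Python sketch. It is not PBC's actual implementation. The function names build_bar and render_bar, the sample data, and the use of text symbols in place of colored pixels are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def build_bar(values: Sequence[float], labels: Sequence[str], width: int) -> List[List[str]]:
    """Sort objects by one attribute and lay their class labels out row by row.

    Hypothetical helper: each cell of the result stands for one pixel of the bar.
    """
    if len(values) != len(labels):
        raise ValueError("values and labels must have the same length")
    if width <= 0:
        raise ValueError("width must be positive")
    # Sort (value, label) pairs by attribute value only; sorted() is stable,
    # so objects with equal values keep their input order.
    ordered = [label for _, label in sorted(zip(values, labels), key=lambda pair: pair[0])]
    # Fill the bar line by line, `width` pixels per line.
    return [ordered[i:i + width] for i in range(0, len(ordered), width)]


def render_bar(bar: List[List[str]], symbols: Dict[str, str]) -> str:
    """Render the bar as text, one symbol per class (a stand-in for one color per class)."""
    return "\n".join("".join(symbols[label] for label in row) for row in bar)


if __name__ == "__main__":
    # Made-up training data: one attribute and two classes, A and B.
    values = [5.1, 1.2, 3.3, 0.7, 4.8, 2.0]
    labels = ["B", "A", "B", "A", "B", "A"]
    print(render_bar(build_bar(values, labels, width=3), {"A": "#", "B": "."}))
```

In this made-up data the attribute separates the classes cleanly, so the rendered bar shows one solid block per class. An attribute whose sorted values interleave the classes would produce a mixed pattern instead, which is the kind of visual cue the slides describe for picking split candidates.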