Collective Classification
A brief overview and possible connections to email-acts classification
Vitor R. Carvalho
Text Learning Group Meetings, Carnegie Mellon University
November 10th 2004

Data Representation
• “Flat” Data
  – Object: email msgs
  – Attributes: words, sender, etc
  – Class: spam/not spam
  – Usually assumed IID
  [Figure: messages labeled not spam, spam, spam, not spam, spam]
• Sequential Data
  – Object: words in text
  – Attr: capitalized, number, dict
  – Class: POS (or name/not)
  [Figure: word sequence labeled pron, verb, name, det, name]
• Relational Data
  – class+attributes
  – +links (relations)
  – Example: webpages
  [Figure: J. Neville et al., 2003]

Relational Data and Collective Classification
• Different objects interact
• Different types of relations (links)
• Attributes may be correlated
• Examples:
  – actors, directors, movies, companies
  – papers, authors, conferences, citations
  – company, employee, customer
Classify objects collectively
Use prediction on some objects to improve prediction on related objects

Collective Classification Methods
• Relational Probability Trees (RPT)
• Iterative methods (Relaxation-based Methods)
• Relational Dependency Networks (RDN)
• Relational Bayesian Networks (RBN/PRM)
• Relational Markov Networks (RMN)
• Other models (ILP based, Vector Space based, etc)
• Overall:
  – Lack of direct comparison among methods
  – Results are usually compared to “flat” model
  – Splitting data into train/test sets can be an issue

Relational Probability Trees
• Decision Trees applied to Relational data
• Predicts the target class label based on:
  – same object attributes
  – attributes + links in “relational neighborhood” (one link away)
  – counts of attributes and links in the “neighborhood”
• Enhanced feature selection (Chi-square, pruning, randomization tests)
• Results were not exciting
• Neville et al., KDD 2003; related work from Blockeel et al.
(Artificial Intelligence, 1998), Kramer, AAAI-96

Iterative Methods
• Predicts the target class label based on:
  – Same object attributes
  – Attributes and links of relational neighborhood
  – CLASS LABEL of neighborhood
  – Features derived from CLASS LABELS
• Different update strategies:
  – By threshold in prediction confidence
  – By top-N most confident predictions
  – Heuristic-based
• Slattery & Mitchell, ICML-2000; Neville & Jensen, AAAI-2000; Chakrabarti et al., ACM SIGMOD-98
• Some results with Email-acts

Relational Bayesian Networks (RBN/PRM)
• Bayes Net extended to Relational domain
• Given an “instantiation”, it induces a Bayes net that specifies a joint probability distribution over all attributes of all entities
• Directed graphical model, with acyclicity constraint
• Exact model: closed form for parameter estimation
  – Products of conditional probabilities
• Was applied to simple domains, since the acyclicity constraint is very restrictive for most relational applications
• Friedman et al., IJCAI-99; Getoor et al., ICML-2001; Taskar et al., IJCAI-2001

Relational Markov Networks (RMN)
• Extension of CRF idea to Relational Domain
• Given an instantiation, it induces a Markov Network that specifies a probability distribution of labels, given links and attributes
• Undirected, Discriminative model
• Parameter estimation is expensive, requires approximate probabilistic inference (belief propagation)
• Taskar et al., UAI 2002

Relational Dependency Networks (RDN)
• Dependency Networks extended to Relational domain
• P(X) = ∏_i P(X_i | Neighbor(X_i))
• Given an “instantiation”, it induces a DN that specifies an “approximate” joint probability distribution over all attributes of all objects
• Undirected graphical model, no acyclicity constraint.
• Approximate model: simple parameter estimation
  – approximate inference (Gibbs sampling)
• Neville & Jensen, KDD-MRDM-2003

Other Models
[Figure from Neville et al., 2003]

Comparing Some Results
• Comparing PRM, RMN, SVM and M^3N
• Diff: PRM and RMN
• Diff: mSVM and RMN
• RN* (Relational Neighbor) is a very simple Relational Classifier
• RN* (Macskassy et al., 2003); M^3N (Taskar et al., 2003)
[Result figures: PRM, RMN]

End of overview… now, the email-act problem
• Strong correlation with previous and next message
• A “verb” has little or no correlation with other “verbs” of same message
• Flat data?
• Sequential data?
[Figure: messages labeled with email acts (Proposal, Request, Delivery, Commit, ...); axis: Time]
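
One way to picture how the iterative methods from the overview might use the correlation with the previous and next message is the minimal Python sketch below. It is an illustration under stated assumptions, not the method of any cited paper: the names `iterative_classify`, `combined_scores` and `compat`, the threshold of 0.8, and every number in the example are hypothetical. Each message starts with per-act scores from some flat classifier. Confident labels are committed first (the "threshold in prediction confidence" update strategy, falling back to the single most confident prediction), and committed neighbor labels then reweight the messages that remain.

```python
from typing import Dict, List, Optional, Tuple

Scores = Dict[str, float]
Compat = Dict[Tuple[str, str], float]


def normalize(scores: Scores) -> Scores:
    # Assumes all scores and weights are positive, so the total is nonzero.
    total = sum(scores.values())
    return {label: value / total for label, value in scores.items()}


def combined_scores(i: int, local: List[Scores],
                    labels: List[Optional[str]], compat: Compat) -> Scores:
    """Local scores of message i, reweighted by already-labeled neighbors."""
    scores = dict(local[i])
    if i > 0 and labels[i - 1] is not None:
        # compat[(previous_act, act)]: hypothetical weight, default 1.0
        for label in scores:
            scores[label] *= compat.get((labels[i - 1], label), 1.0)
    if i + 1 < len(labels) and labels[i + 1] is not None:
        for label in scores:
            scores[label] *= compat.get((label, labels[i + 1]), 1.0)
    return normalize(scores)


def iterative_classify(local: List[Scores], compat: Compat,
                       threshold: float = 0.8) -> List[str]:
    labels: List[Optional[str]] = [None] * len(local)
    while any(label is None for label in labels):
        # Score every unlabeled message against the labels known so far.
        candidates = []
        for i, current in enumerate(labels):
            if current is None:
                scores = combined_scores(i, local, labels, compat)
                best = max(scores, key=scores.get)
                candidates.append((scores[best], i, best))
        # Update strategy: commit predictions above the threshold;
        # if none qualifies, commit only the most confident one (top-1).
        confident = [c for c in candidates if c[0] >= threshold]
        if not confident:
            confident = [max(candidates)]
        for _, i, best in confident:
            labels[i] = best
    return labels


# Hypothetical three-message thread; the middle message is ambiguous.
local_scores = [
    {"Request": 0.90, "Commit": 0.05, "Delivery": 0.05},
    {"Request": 0.45, "Commit": 0.40, "Delivery": 0.15},
    {"Request": 0.05, "Commit": 0.05, "Delivery": 0.90},
]
compat = {
    ("Request", "Commit"): 3.0,   # a Request is often followed by a Commit
    ("Commit", "Delivery"): 3.0,  # a Commit is often followed by a Delivery
    ("Request", "Request"): 0.5,
}

flat = [max(s, key=s.get) for s in local_scores]
collective = iterative_classify(local_scores, compat)
print(flat)
print(collective)
```

In this toy thread the flat prediction labels the middle message as a Request, while the collective pass relabels it as a Commit once its two neighbors are fixed. Real systems would learn the compatibility weights from data instead of setting them by hand.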