HIDDEN CONCEPT DETECTION IN GRAPH-BASED RANKING ALGORITHMS FOR PERSONALIZED RECOMMENDATION
Nan Li, Computer Science Department, Carnegie Mellon University

INTRODUCTION
• Previous work: represents past user behavior through a relational graph, but fails to capture individual differences among items of the same type.
• Our work: detects hidden concepts embedded in the original graph and builds a two-level type hierarchy that explicitly represents item characteristics.

RELATIONAL RETRIEVAL
1. Entity-relation graph G = (E, T, R):
• Entity set E = {e}, entity type set T = {T}, entity relation set R = {R}.
• Each entity e in E has a type e.T.
• Each relation R connects two entity types, R.T1 and R.T2. If entities e1 and e2 are related by R, then R(e1, e2) = 1; otherwise R(e1, e2) = 0.
2. Relational retrieval task:
• Query q = (Eq, Tq): given the query entities Eq = {e'}, predict the relevance of each entity e of the target type Tq.

PATH RANKING ALGORITHM
1. Relational path: P = (R1, R2, …, Rn), where R1.T1 = T0 and Ri.T2 = Ri+1.T1.
2. Relational path probability distribution hP(e): the probability that a random walker following path P reaches entity e from a query entity.

PRA MODEL (G, l, θ)
• Each column of the feature matrix A is one path's distribution hP(e).
• The scoring function is a weighted combination of the path features: s(e; θ) = ΣP θP hP(e).

TRAINING THE PRA MODEL
1. Training data: D = {(q(m), y(m))}, where ye(m) = 1 if entity e is relevant to query q(m).
2. Parameters: the path weights θ.
3. Objective function: the (regularized) likelihood of the training data.

HIDDEN CONCEPT DETECTOR (HCD)
• Two-layer PRA: find hidden subtypes of relations.
• (Figure: graph schema with entity types author, title, paper, gene, journal, year.)

BOTTOM-UP HCD
Bottom-up merging algorithm, for each relation type Ri:
• Step 1: split the relation so that every starting node of Ri forms its own subrelation Rij.
• Step 2: hierarchical agglomerative clustering (HAC): repeatedly merge the two subrelations Rim and Rin that maximize the gain in the objective function, until no merge yields a positive gain.

APPROXIMATING THE GAIN OF THE OBJECTIVE FUNCTION
1. Calculate the maximum gain of the two relations, gm and gn.
2. Use a Taylor series to approximate the gain of merging them.
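The random-walk path features and weighted scoring used by PRA above can be sketched with a toy graph; the entities, relation names (author_of, cites), and the weight theta below are illustrative assumptions, not values from the slides.

```python
# Toy sketch of PRA path features: h_P(e) is the probability that a random
# walker following relational path P reaches entity e from a query entity.
# Graph contents and theta are hypothetical, for illustration only.

def row_normalize(matrix):
    """Turn an adjacency matrix into row-stochastic transition probabilities."""
    out = []
    for row in matrix:
        s = sum(row) or 1.0
        out.append([v / s for v in row])
    return out

def step(dist, matrix):
    """One random-walk step: push probability mass along one relation."""
    n_cols = len(matrix[0])
    new = [0.0] * n_cols
    for i, p in enumerate(dist):
        for j in range(n_cols):
            new[j] += p * matrix[i][j]
    return new

def path_distribution(query_dist, path):
    """h_P(e): follow the relations of path P in order."""
    dist = query_dist
    for relation in path:
        dist = step(dist, relation)
    return dist

# author_of: 2 authors x 3 papers; cites: 3 papers x 3 papers
author_of = row_normalize([[1, 1, 0], [0, 0, 1]])
cites = row_normalize([[0, 1, 0], [0, 0, 1], [1, 0, 0]])

h = path_distribution([1.0, 0.0], [author_of, cites])   # start at author 0
theta = 0.8                                             # one learned path weight
scores = [theta * p for p in h]                         # s(e) = sum_P theta_P h_P(e)
```

In the full model there is one such distribution per relational path, stacked as the columns of the feature matrix A, and the score is the θ-weighted sum across paths.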
EXPERIMENTAL RESULTS
1. Data set: the Saccharomyces Genome Database, a publication data set about the yeast organism Saccharomyces cerevisiae.
2. Three measurements:
• Mean Reciprocal Rank (MRR): the inverse of the rank of the first correct answer.
• Mean Average Precision (MAP): the area under the precision-recall curve.
• p@K: precision at K, where K is the actual number of relevant entities.

NORMALIZED CUT
• Training data: as the number of clusters increases, recommendation quality increases.
• Test data: NCut outperforms random clustering.

HCD
• Training data: HCD outperforms PRA on all three measurements.
• Test data: the two systems perform equally well.

FUTURE WORK
• Bottom-up vs. top-down detection.
• Improve efficiency.
• Type recovery in non-labeled graphs.
• Building an intelligent agent that simulates human-level learning using machine learning techniques.

A COMPUTATIONAL MODEL OF ACCELERATED FUTURE LEARNING THROUGH FEATURE RECOGNITION
Nan Li, Computer Science Department, Carnegie Mellon University

ACCELERATED FUTURE LEARNING
• Accelerated future learning: learning more effectively because of prior learning. It has been observed often, but how does it happen?
• Expert vs. novice: an expert uses deep functional features (e.g. -3 in -3x), while a novice relies on shallow perceptual features (e.g. 3 in -3x).

A COMPUTATIONAL MODEL
• Model accelerated future learning.
• Use machine learning techniques to acquire deep features.
• Integrate the model into a machine-learning agent.

AN EXAMPLE IN ALGEBRA
(Figure: worked algebra example.)

FEATURE RECOGNITION AS PCFG INDUCTION
• Underlying structure in the problem ↔ grammar.
• Feature ↔ intermediate symbol in a grammar rule.
• Feature learning task ↔ grammar induction.
• Error ↔ incorrect parsing.

PROBLEM STATEMENT
• Input: a set of feature recognition records, each consisting of an original problem (e.g. -3x) and the feature to be recognized (e.g. -3 in -3x).
• Output: a PCFG and an intermediate symbol in a grammar rule.
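The "most probable parse tree" at the heart of this framing can be sketched as a Viterbi-style CYK parse. The grammar below (symbols such as SignedNum, and its probabilities) is a hypothetical illustration of how an intermediate symbol comes to cover the deep feature -3 inside -3x; it is not the grammar from the slides.

```python
import math
from collections import defaultdict

# Hypothetical PCFG in Chomsky normal form for terms like "-3x".
unary = {('Neg', '-'): 1.0, ('Num', '3'): 1.0, ('Var', 'x'): 1.0}
binary = {('SignedNum', ('Neg', 'Num')): 1.0,   # intermediate symbol = deep feature "-3"
          ('Term', ('SignedNum', 'Var')): 1.0}

def viterbi_parse(tokens):
    """CYK chart holding the most probable derivation for each span and symbol."""
    n = len(tokens)
    chart = defaultdict(dict)   # chart[(i, j)][sym] = (log prob, backpointer)
    for i, tok in enumerate(tokens):
        for (sym, term), p in unary.items():
            if term == tok:
                chart[(i, i + 1)][sym] = (math.log(p), tok)
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for split in range(i + 1, j):
                for (lhs, (a, b)), p in binary.items():
                    if a in chart[(i, split)] and b in chart[(split, j)]:
                        lp = (math.log(p) + chart[(i, split)][a][0]
                              + chart[(split, j)][b][0])
                        if lhs not in chart[(i, j)] or lp > chart[(i, j)][lhs][0]:
                            chart[(i, j)][lhs] = (lp, (a, split, b))
    return chart

chart = viterbi_parse(['-', '3', 'x'])
# 'SignedNum' spans tokens 0..2, i.e. it recognizes "-3" inside "-3x".
```

In the learned model, the symbol chosen as the target feature is the intermediate symbol whose span matches the most feature-recognition records.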
ACCELERATED FUTURE LEARNING THROUGH FEATURE RECOGNITION
• Extended a PCFG learning algorithm (Li et al., 2009) with feature learning.
• Stronger prior knowledge: transfer learning using prior knowledge.
• Better learning strategy: effective learning using a bracketing constraint.

A TWO-STEP ALGORITHM
• Greedy Structure Hypothesizer (GSH): hypothesizes the schema structure.
• Viterbi training phase: refines schema probabilities and removes redundant schemas; generalizes the Inside-Outside algorithm (Lari & Young, 1990).

GREEDY STRUCTURE HYPOTHESIZER
• Structure learning, bottom-up.
• Prefers recursive structures to non-recursive ones.

EM PHASE
• Step one: parse tree computation — build the most probable parse tree.
• Step two: selection probability update for each rule ai → aj ak with probability p.

FEATURE LEARNING
• Build the most probable parse trees for all observation sequences.
• Select the intermediate symbol that matches the most training records as the target feature.

TRANSFER LEARNING USING PRIOR KNOWLEDGE
• GSH phase: build parse trees based on the previously acquired grammar, then call the original GSH.
• Viterbi training: add the rule frequencies from the previous task to the current task.

EFFECTIVE LEARNING USING BRACKETING CONSTRAINT
• Force the learner to generate a feature symbol.
• Learn a subgrammar for the feature and a grammar for the whole trace, then combine the two grammars.

EXPERIMENT DESIGN IN ALGEBRA
(Figure: experiment design.)

EXPERIMENT RESULTS IN ALGEBRA
• Fig. 2: curriculum one. Fig. 3: curriculum two. Fig. 4: curriculum three.
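The selection-probability update in the Viterbi training phase, including the transfer step that adds rule frequencies from a previous task, can be sketched as relative-frequency estimation; the rules and counts below are illustrative assumptions, not numbers from the slides.

```python
from collections import Counter, defaultdict

def update_probabilities(rule_counts, prior_counts=None):
    """P(lhs -> rhs) = (current count + prior-task count) / total count for lhs."""
    counts = Counter(rule_counts)
    if prior_counts:
        counts.update(prior_counts)          # transfer: add previous-task frequencies
    totals = defaultdict(float)
    for (lhs, _rhs), c in counts.items():
        totals[lhs] += c
    return {rule: c / totals[rule[0]] for rule, c in counts.items()}

# Rule counts from the current task's most probable parse trees (hypothetical)
current = {('Expr', ('SignedNum', 'Var')): 3, ('Expr', ('Num', 'Var')): 1}
# Rule counts remembered from a previously learned task (hypothetical)
previous = {('Expr', ('SignedNum', 'Var')): 3, ('Expr', ('Num', 'Var')): 3}

without_transfer = update_probabilities(current)
with_transfer = update_probabilities(current, previous)
```

With transfer, the previous task's counts act as pseudo-counts, pulling the current estimates toward the probabilities learned earlier.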
• Both stronger prior knowledge and a better learning strategy can yield accelerated future learning.
• Stronger prior knowledge produces faster learning outcomes.
• L00 generated human-like errors.

LEARNING SPEED IN SYNTHETIC DOMAINS
• Both stronger prior knowledge and a better learning strategy yield faster learning.
• Stronger prior knowledge produces faster learning outcomes with a small amount of training data, but not with a large amount.
• Learning with subtask transfer shows a larger difference: 1) training process; 2) …

SCORE WITH INCREASING DOMAIN SIZES
• The base learner, L00, shows the fastest drop.
• Average time spent per training record: less than 1 millisecond, except for L10 (266 milliseconds).
• L10 needs to maintain previous knowledge and does not separate a trace into smaller traces.

INTEGRATING ACCELERATED FUTURE LEARNING IN SIMSTUDENT
(Screenshot: SimStudent curriculum browser, with the tutee "Lucky" working through leveled linear-equation units.)
• Integrate the acquired grammar into production rules.
• SimStudent is a machine-learning agent that acquires production rules from examples and problem-solving experience.
• It requires only weak operators (non-domain-specific knowledge), and fewer operators overall.

CONCLUDING REMARKS
• Presented a computational model of human learning that yields accelerated future learning.
• Showed that both stronger prior knowledge and a better learning strategy improve learning efficiency.
• Stronger prior knowledge produced faster learning outcomes than a better learning strategy.
• Some models generated human-like errors, while others did not make any mistakes.