Research-Insight: Providing Insight on Research by Publication Network Analysis
Fangbo Tao, Kin Hou Lei, Yizhou Sun, Chi Wang, Tim Weninger, Jiawei Han

Motivation
When doing research, which confusing parts can an “information system” help with?
* What is your next research “big thing”?
* Who should you collaborate with?
* Which papers do you need to read? The latest ones? Related ones?
* Which papers do you need to cite?
Global insight? Personalized insight! It draws on previous works, common affiliations, social connections, papers already read, etc.

Data Source: CSR-Net
* DBLP dataset: an information-rich CS publication network
* Mining web information: hierarchical affiliation info (University-Department-Research Group)
* Citation data: ArnetMiner & Citeseer

Functions We Want to Support
* Similarity Search
* Ranking-based Clustering & Classification
* Literature Recommendation
* Collaboration Prediction
* Academic Profile Generation
* Historical Affiliations Prediction
* Academic Family Discovery

System Architecture

Similarity Search
Example
* Given an author, find his/her top-k similar authors and explain why, by showing the corresponding meta-path and similarity measure.
* Compare PathSim with other measures: SimRank and Personalized PageRank.
Potential Extension
* Find the top-k most related objects of heterogeneous types, e.g. the venues and terms related to “Christos Faloutsos”.

Ranking-based Clustering & Classification
Example
* Given a sub-network (DB, DM, IR, ML) and a desired number of clusters, perform clustering and show the top-k objects of each type.
* Do the same for the restricted network (DB).
* Classification is similar.
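The similarity search slide above names PathSim as the measure used to find top-k similar authors. As a minimal sketch of how that could look, the block below computes PathSim along the symmetric meta-path Author-Paper-Venue-Paper-Author, where the number of path instances between two authors is the dot product of their per-venue paper counts. The author names, venue counts, and function names are illustrative assumptions, not data or code from the system.

```python
# Toy author -> {venue: number of papers} counts (hypothetical data).
AUTHOR_VENUE = {
    "Mike": {"SIGMOD": 2, "VLDB": 1},
    "Jim": {"SIGMOD": 50, "VLDB": 20},
    "Mary": {"SIGMOD": 2, "ICDE": 1},
    "Bob": {"SIGMOD": 2, "VLDB": 1},
    "Ann": {"ICDE": 1, "KDD": 1},
}


def path_count(counts_x, counts_y):
    """Number of Author-Paper-Venue-Paper-Author path instances between
    two authors: the dot product of their per-venue paper counts."""
    return sum(n * counts_y.get(venue, 0) for venue, n in counts_x.items())


def pathsim(author_venue, x, y):
    """PathSim for a symmetric meta-path:
    2 * |paths x~>y| / (|paths x~>x| + |paths y~>y|)."""
    xy = path_count(author_venue[x], author_venue[y])
    xx = path_count(author_venue[x], author_venue[x])
    yy = path_count(author_venue[y], author_venue[y])
    if xx + yy == 0:
        return 0.0
    return 2.0 * xy / (xx + yy)


def top_k_similar(author_venue, x, k):
    """Return the k authors most similar to x as (name, score) pairs,
    highest score first; ties are broken alphabetically."""
    scored = [(other, pathsim(author_venue, x, other))
              for other in author_venue if other != x]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


if __name__ == "__main__":
    for name, score in top_k_similar(AUTHOR_VENUE, "Mike", 3):
        print(f"{name}: {score:.4f}")
```

Because the denominator normalizes by each author's own path count, a very prolific author such as the hypothetical "Jim" does not dominate the ranking; authors with a similar publication profile and similar visibility score highest, which is the behavior that distinguishes PathSim from random-walk measures.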
Ranking-based Clustering & Classification: Potential Extension
* User-provided constraints: specify that a node belongs to a class/cluster
* Different ranking rules

Literature Recommendation
* Traditional keyword-based search systems (G-Scholar) measure the document similarity between query and paper.
* Combine network structural similarity and document similarity:
  * Academic history
  * Research community

Reading Recommendation
Example: a young researcher has published 10 papers on two themes in three major conferences. He or she wants to know:
* Newly published papers along his/her research line
* Papers extending his/her research scope
* Classical papers along the theme that he/she has not cited
* Papers from the same group/university in a similar domain
Planned Solution
* Step 1: Find a set of term clusters in the researcher’s work.
* Step 2: Find authors/venues/papers that are reputed in these themes.
* Step 3: Recommend based on freshness, topic closeness, influence of the paper, and structural closeness.

Citation Recommendation
Example: given the set of authors, the title, and the abstract of a planned paper, return the papers that should be cited. These may include influential original papers and recently published related ones.
Solution: a two-step approach [Yu et al.]. Use a meta-path-based feature space to interpret structural information, and build discriminative term buckets for citation prediction. Furthermore, combine both document similarity and structural similarity.

Collaboration Prediction
For each researcher, given his/her publication history and affiliation, one may get:
* Advisor and group mates
* Other professors/students in the same institution in a related discipline
* Researchers in the same field but with different affiliations
For each recommended relationship, explain why the prediction is made by showing the weighted paths.
Potential Extension: predict collaboration given a specific research theme.
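The collaboration prediction slide asks that each recommended relationship be explained by its weighted paths. One simple way to sketch this, assuming a linear score over meta-path counts (an assumption made here for illustration, not the model described in the slides), is to keep the per-path contributions alongside the total score. The meta-path labels, weights, and candidate names below are hypothetical.

```python
# Hypothetical meta-path weights; a real system would learn these from data.
PATH_WEIGHTS = {
    "A-P-A-P-A": 0.5,    # shared co-author
    "A-P-V-P-A": 0.25,   # published in the same venue
    "A-Org-A": 1.0,      # same institution
}

# Hypothetical path counts between one researcher and each candidate.
CANDIDATES = {
    "candidate_1": {"A-P-A-P-A": 4, "A-P-V-P-A": 10},
    "candidate_2": {"A-Org-A": 1, "A-P-V-P-A": 5},
}


def explain_score(path_counts, weights):
    """Return (total score, per-path contributions), with contributions
    sorted from largest to smallest so the top reason comes first."""
    contributions = {path: weights.get(path, 0.0) * count
                     for path, count in path_counts.items()}
    ranked = sorted(contributions.items(), key=lambda kv: (-kv[1], kv[0]))
    return sum(contributions.values()), ranked


def rank_candidates(candidates, weights, k):
    """Rank candidates by weighted path score and keep the explanation."""
    scored = []
    for name, counts in candidates.items():
        score, reasons = explain_score(counts, weights)
        scored.append((name, score, reasons))
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:k]


if __name__ == "__main__":
    for name, score, reasons in rank_candidates(CANDIDATES, PATH_WEIGHTS, 2):
        print(name, score)
        for path, contribution in reasons:
            print("   ", path, contribution)
```

Keeping the contributions separate from the score means the explanation shown to the user is exactly what produced the ranking, rather than a justification reconstructed afterwards.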
Academic Family, Affiliation History
Example: given a researcher (Jure Leskovec), we present his current institution and its likely time span, as well as his historical institutions and academic family.
Solution: our approach is based on a set of training data and a small set of rules.
* “An advisor has more publications and a longer history than the advisee at the time of advising.”
* “Once an advisee becomes an advisor, he or she never becomes an advisee again.”
The training data comes from web-mined current affiliations. Iterative constraint propagation helps uncover the hidden affiliation history and academic family.

Roadmap
* Data Cube: efficient summary; highly structured data.
* Rich Text: topic analysis, query answering.
* Common: ASRS, IMDB, Publication-Net, News…
* Network (HIN): good at mining; contains structural information; no information loss.
* One more thing: rich text.
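The two advising rules quoted in the Academic Family, Affiliation History slide can be read as filters on candidate advisor-advisee pairs. The sketch below is one possible encoding, assuming each author is represented only by a list of publication years; the function names, the toy years, and the choice to allow the last advisee year to equal the first advising year are all assumptions for illustration, and the iterative constraint propagation step is not shown.

```python
def cumulative_pubs(pub_years, year):
    """Number of papers published up to and including the given year."""
    return sum(1 for y in pub_years if y <= year)


def can_advise(advisor_years, advisee_years, year):
    """Rule 1: at the time of advising, the advisor has a longer history
    (earlier first paper) and more publications than the advisee."""
    if not advisor_years or not advisee_years:
        return False
    longer_history = min(advisor_years) < min(advisee_years)
    more_pubs = (cumulative_pubs(advisor_years, year)
                 > cumulative_pubs(advisee_years, year))
    return longer_history and more_pubs


def consistent_roles(advised_until, advises_from):
    """Rule 2: once a person starts advising, they are never an advisee
    again. Either argument may be None when that role never occurs.
    Assumption: the last advisee year may equal the first advising year."""
    if advised_until is None or advises_from is None:
        return True
    return advised_until <= advises_from


# Hypothetical publication years for two authors.
SENIOR = [1995, 1996, 1998, 2000, 2001]
JUNIOR = [2000, 2001]

if __name__ == "__main__":
    print(can_advise(SENIOR, JUNIOR, 2001))
    print(can_advise(JUNIOR, SENIOR, 2001))
    print(consistent_roles(2004, 2006))
```

Filters like these only prune impossible pairs; choosing among the remaining candidates is where the training data and constraint propagation mentioned in the slide would come in.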