Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Complex Social Network Mining —Theory, Methodologies, and Applications Jie Tang Department of Computer Science and Technology Tsinghua University Email: [email protected] 1 Social Networks Mobile Web (2008-20) Web 1.0 (1989) Connecting via mobiles… Pages, hyperlinks Relevance search Web 2.0 (2004) social networks Blogs, micro-blogs Web-based (or mobile-based) social networks already become a bridge to connect our real daily life and the virtual web space 2 Web-based Social Network Mining —Theory, Methodologies, and Applications 2 Search/query over networks 3 7 1 4 Social data integration Attribute/link prediction Social knowledge acquisition 6 5 Trust and privacy Social dynamics Social influence analysis Theoretical layer Collective learning Learning from users Web1.0: Web of Pages 3 Social theory Web 2.0: Web of People Graphical models Web 3.0: Web of Semantics Outline • ArnetMiner: Academic Social Network • Core Techniques – Knowledge Acquisition – Semantic Integration – Heterogeneous Ranking – Social Influence Analysis • Demo 4 ArnetMiner.org - Academic research social network analysis and mining system 提供全面的研究者网络分析与挖掘功能 Papers published: ACM TKDD, KDD’08-10, SDM’09, ICDM’07-09, CIKM’07-09, DKE, JIS http://arnetminer.org/ 5 Why Arnetminer.org? “The information need is not only about publication…” “Academic search is treated as document search, but ignore semantics” 6 Examples – Expertise search • When starting a work in a new research topic; • Or brainstorming for novel ideas. Researcher A • Who are experts in this field? • What are the top conferences in the field? • What are the best papers? • What are the top research labs? 7 Examples – Citation network analysis Researcher B • an in-depth understanding of the research field? An Inverted Index Implementation Introduction of Modern Information Retrieval Topics Filtered Document Retrieval with Frequency-Sorted Indexes Parameterised Compression for Sparse Bitmaps Memory Efficient Ranking Topic 31: Ranking and Inverted Index Topic 1 : Theory Topic 27: Information retrieval Topic 23: Index method Signature les: An access Method for Documents and its Analytical Performance Evaluation Topic 21: Framework Topic 34: Parallel computing Self-Indexing Inverted Files for Fast Text Retrieval Topic 22: Compression A Document-centric Approach to Static Index Pruning in Text Retrieval Systems Other Vector-space Ranking with Effective Early Termination Citation Relationship Type Efficient Document Retrieval in Main Memory 8 Static Index Pruning for Information Retrieval Systems Basic theory Comparable work Other Examples – Conference Suggestion authors Which conference should we submit the paper? Researcher C content 9 Examples – Reviewer Suggestion KDD Committee conference Paper content 10 Who are best matching reviewers for each paper? Our Social Network is Black Social network without role/relationship info, e.g. a company’s email network Latent relationship graph How to infer CEO Manager Employee Fortunately, user interactions form implicit groups 11 From BW to Color 12 3 2 1 13 Person Search Basic Info. Citation statistics Research Interests Social Network Fundings Publications 14 Expertise Search Finding experts, expertise conferences, and expertise papers for ―information retrieval‖ 15 Course Search Finding courses for ―data mining‖ 16 Association Search Finding associations between persons - high efficiency - Top-K associations Usage: - to find a partner - to find a person with same interests 17 Sub-Graph Search Sub graphs 18 Topic Browser 200 topics have been discovered automatically from the academic network 19 Academic Performance Measurement Academic Statistics Personal Statistics 20 Outline • ArnetMiner: Academic Social Network • Core Techniques – Knowledge Acquisition – Semantic Integration – Heterogeneous Ranking – Social Influence Analysis • Demo 21 ArnetMiner: Overview 2 3 Modeling and Search Network α β θ Φ A Social Network Analysis Social involution analysis Topic model T w ad x z c Nd μ ψ T D Academic suggestion Citation tracing analysis Expertise search Social influence analysis Association... cite Access interface write write Tree CRF... Indexing publish RNKB Storage publish ISWC Social Network Extraction cite 1 write EOS... publish Semantic... write Name Disambiguation Pc member Profiling extraction Publication extraction Homepage finding Data Sources Papers Homepage 22 ACM Papers DBLP Papers Libra citation Scholar publish write Limin Prof. Wangpublish cite Integration SVM... WWW cite publish Metadata Map-reduce--Distributed processing platform Dr. Tang Social Network Storage write Prof. Li coauthor Write write Annotation... coauthor IJCAI CT1: Knowledge Acquisition from Social Web (ACM TKDD, ISWC’06, ICDM’07, ACL’07, CIKM’07-08) Ruud Bolle 2 Office: 1S-D58 Letters: IBM T.J. Watson Information Research Center Contact P.O. Box 704 Ruud Bolle Office: 1S-D58 Yorktown Heights, NY 10598 USA IBM T.J. WatsonCenter Research Center Letters: Packages: IBM T.J. Watson Research Skyline Drive P.O. Box19704 Hawthorne, NY10598 10532USA USA Yorktown Heights, NY Email: Packages: [email protected] T.J. Watson Research Center 19 Skyline Drive Ruud M. Bolle was born in Voorburg, The Netherlands. He received the Bachelor's Hawthorne, NY 10532 USA Degree in Analog Electronics 1977 and the Master's Degree in Electrical Email: [email protected] Engineering in 1980, both from Delft University of Technology, Delft, The In 1983 he received Master'sEducational Degree in Applied Mathematics and in Ruud M.Netherlands. Bolle was born in Voorburg, Thethe Netherlands. He received thehistory Bachelor's the Ph.D. in Electrical Engineering from Brown University, Providence, Rhode Degree 1984 in Analog Electronics in 1977 and the Master's Degree in Electrical Island. In 1984 he from became Research of Staff Member atDelft, the IBM Engineering in 1980, both Delfta University Technology, TheThomas J. Watson Research Center in the Artificial Intelligence Department the Computer Netherlands. In 1983 he received the Master's Degree in Applied of Mathematics andScience in In 1988Engineering he became from manager of University, the newly formed Exploratory 1984 theDepartment. Ph.D. in Electrical Brown Providence, RhodeComputer Vision whichaisResearch part of theStaff Math Sciences Department. Island. In 1984Group he became Member at the IBM Thomas J. Watson Research Center in the Artificial Intelligence Department of the Computer Science Currently, hishe research are onformed video database indexing, video Department. In 1988 becameinterests manager offocused the newly Exploratory Computer processing, visual interaction and biometrics applications. Vision Group which is part human-computer of the Math Sciences Department. video database indexing video processing visual human-computer interaction biometrics applications 1 Two questions: IBM T.J. Watson Research Center IBM T.J. Watson Research Center 19 Skyline Drive Hawthorne, NY 10532 USA Research_Interest Research Staff Affiliation IBM T.J. Watson Research Center P.O. Box 704 Address Yorktown Heights, NY 10598 USA Address http://researchweb.watson.ibm.com/ ecvg/people/bolle.html Position Homepage Photo Name Ruud Bolle [email protected] Email • How to accurately extract the researcher 1 profile information from the Web? services • How toAcademic integrate the information from different sources? Ruud M. Bolle interests is a Fellow the IEEE thedatabase AIPR. Heindexing, is Area Editor Currently, his research areoffocused onand video video of Computer Vision andhuman-computer Image Understanding and Associate Editor applications. of Pattern Recognition. Ruud processing, visual interaction and biometrics M. Bolle is a Member of the IBM Academy of Technology. Ruud M. Bolle is a Fellow of the IEEE and the AIPR. He is Area Editor of Computer Vision and Image Understanding and Associate Editor of Pattern Recognition. Ruud M. Bolle is a Member of the IBM Academy of Technology. 1984 Brown University Electrical Engineering 2006 Msuniv Delft University of Technology Sharat Chikkerur, Sharath Pankanti, Alan Jea, Nalini K. Ratha, Ruud M. Bolle: Fingerprint 49 EE Representation Using Localized Texture Features. ICPR (4) 2006: 521-524 2 Andrew Senior, Arun Hampapur, Ying-li Tian, Lisa Brown, Sharath Pankanti, Ruud M. Bolle: 48 EE Appearance models for occlusion handling. Image Vision Comput. 24(11): 1233-1243 (2006) Electrical Engineering Applied Mathematics Co-author 47 46 ... 23 Co-author Publication 2# Publication 1# Title Title Cancelable Biometrics: A Case Study in Venue Fingerprints Date End_page Start_page ICPR 2005 1Ruud M. Bolle, Jonathan H. Connell, Sharath Pankanti, Nalini K. Ratha, Andrew W. Senior: EE The Relation between the ROC Curve and the CMC. AutoID 2005: 15-20 Sharat Chikkerur, Venu Govindaraju, Sharath Pankanti, Ruud M. Bolle, Nalini K. Ratha: EE 2 Novel Approaches for Minutiae Verification in Fingerprint Images. WACV. 2005: 111-116 Bsdate 1977 Bsuniv Delft University of Technology Bsmajor Msmajor Msmajor 1Nalini K. Ratha, Jonathan Connell, Ruud M. Bolle, Sharat Chikkerur: Cancelable Biometrics: 50 EE A Case Study in Fingerprints. ICPR (4) 2006: 370-373 Ruud Bolle Analog Electronics 1980 Publications DBLP: Ruud Bolle Phddate Phduniv Phdmajor Msdate 370 Fingerprint Representation Using Localized Texture Features Venue End_page Start_page 2006 2006 521 ICPR 373 coauthor Publication #3 affiliation 524 UIUC Ruud Bolle 2 Publication #5 ... Date coauthor position Professor Researcher Network Extraction 70.60% of the researchers have at least one homepage or an introducing page Research_Interest Fax Affiliation Title 85.6% from universities 14.4% from companies Start_page 71.9% are homepages End_page 40% are in lists and tables 28.1% are introducing pages 60% are natural language text Phone Postion Publication_venue Address Person Photo Email Homepage Publication Name Authored Coauthor Researcher Bsdate Bsuniv Phddate Phduniv Phdmajor Msdate Bsmajor Msuniv Msmajor Date There are a large number of person names having the ambiguity problem Even 3 ―Yi Li‖ graduated from the author’s lab 70% moved at least one time 24 Our Approach Picture – based on Markov Random Field Markov Property: Ya P(Yi | Y j | Y j Yi ) Yb Yc Special cases: P(Yi | Y j | Y j ~ Yi ) - Conditional Random Fields - Hidden Markov Random Fields Ye Yd Yf y4=2 y9=3 t -coauthor y7=2 y1=1 cite coauthor y10=3 y5=2 y6=2 co-conference y3=1 y2=1 coauthor cite coauthor co-conference cite y11=3 y8=1 coauthor x4 x9 x7 x1 Researcher Profiling 25 x5 Name Disambiguation x3 x6 x2 x11 x8 x10 CT2: Semantic Integration (IEEE TKDE, SIGMOD’09, IJCAI’09, ISWC’09) 26 RiMOM-A Tool for Semantic Integration (OAEI’06-09) Benchmark Results 1 0.5 0 Precsion Recall F-measure 1 0.8 0.6 0.4 0.2 0 Anatomy Results Precision Recall Recall+ F-measure agrafsa Subtrack “I’m really surprised byResults the 1 good 0.8 results of these years 0.6 RiMOM, you can compete withPrecision 0.4 http://keg.cs.tsinghua.edu.cn/project/RiMOM/ the0.2 top systems that make use 0 of such background knowledge.‖ 27 CT3: Topic-based Heterogeneous Ranking (Machine Learn. J, KDD’08, ICDM’08, CIKM’09, DKE) Data mining Modeling using VSM ery Qu tor vec Search with keyword 0 0 Principles of Data Mining. DJ Hand - Drug Safety, 2007 - drugsafety.adisonline.com Doc1 r vecto 1 1 1 0 1 0 1 1 Doc3 1 0 1 0 1 0 vector 0 1 1 1 Do 1 1 vectoc4 Return Advances in Knowledge Discovery and Data Mining UM Fayyad, G Piatetsky-Shapiro, P Smyth, R… Data Mining: Concepts and Techniques J Han, M Kamber - 2001… r Search with semantic modeling Experts Expertise conferences Modeling using semantic topics Topics Data mining Data mining 0.4 Association Rules 0.2 Database systems 0.15 0.1 Data management 0.05 0.02 Web databases Information systems 28 Return Expertise papers Challenges write write 1. How to model the heterogeneous academic network? Cite Cite Cite write write ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Co-author Co-author Co-author Co-author Co-author Co-author Co-write Co-write Cite Cite Cite ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Co-write Co-write 2. How to capture the link information for ranking objects in the academic network? 29 ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Cite Cite Cite ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- write write publish publish publish PC PC member member PC member chair chair chair publish publish publish publish publish publish Cite Cite Cite Modeling the Academic Network α β θ Φ A θ Φ AC T α x z c Nd μ T D z c x T z w Nd w Nd c D D ψ η,σ2 T ACT1 conference ACT2 Author-Conference-Topic Model [Tang et al., 08] 30 Φ A ad x β θ ad w Topic β words authors ad α ACT3 Integrating Topic Model into Random Walk Random walk over the academic network Modeling academic network with topics write Cite write ------------------------------------------------------------------------------------------------------------------------ Co-author ------------------------------------------------------------------------------------------------------------------------- Cite Co-write Co-author Cite + -------------------------------------------------------------------------------------------------------------------- Co-write ------------------------------------------------------------------------------------------------ write publish publish chair PC member publish Author-Conference-Topic Model [Tang et al., 08] 31 Cite Combination Method 1 Author Graph Ge Prof. Tang Prof. Wang λdd Conference Graph Gc IJCAI Stage 1: Random walk ISWC Jing Zhang λed λde Paper Graph Gp λcd Ranking score WWW EOS... Tree CRF... Association... λdc Combination by multiplication Topic layer Data mining Stage 2. Topic-based relevance Topic-based relevance score Query ISWC IJCAI WWW ... EOS... Tree CRF... Association... Prof. Tang Prof. Wang Jing Zhang 32 ... Combination Method 2 Author Graph Ge Prof. Tang Prof. Wang Conference Graph Gc λdd Ranking score ISWC IJCAI Jing Zhang λed λde Paper Graph Gp λcd WWW EOS... Tree CRF... Association... λdc Transition probability λdt λtd Hidden Theme Graph Gt pos Web service λqt Query: ontology alignment 33 owl λtq Learning to Rank Experts • Combining more information Empirical loss weight feature 2 a b min 1 zTi wT , xTi xTi wT 2 wT i 1 Model penalty n2 Language model, BM25, tf*idf 34 Heterogeneous Cross-domain Ranking Query: “data mining” Papers Principles of Data Mining Conferences Data Mining: Concepts and Techniques KDD ? SDM ? Authors ICDM ? author/ paper conf P. Yu PAKDD ? ? Loss in one domain Loss in another domain min 1 z Si w S , x Sai x Sbi C 1 zTi wT , xTai xTbi W w S , wT i 1 i 1 n1 Common feature space n1 min 1 z Si w S , U T x Sai x Sbi w S , wT ,U i 1 35 n2 n2 C 1 z w , U T x a x b Ti T Ti Ti i 1 2,1 2 W 2 ,1 2 Learning Algorithm • Equivalent objective function: Optimize the loss function for each domain Common space discovery Optimize the weight via the common space 36 Experimental Results • Data sets – Homogeneous Data • LETOR 2.0: TREC2003, TREC2004, and OHSUMED – Heterogeneous Data • Academic network consisting of 14,134 authors, 10,716 papers, and 1,434 conferences. – Heterogeneous Tasks • Expert finding vs. Bole search • Baselines – RSVM – Language model 37 Results on Homogeneous Data 38 Results on Heterogeneous Data 39 Results on Heterogeneous Tasks • Expert finding verse Bole search (finding best supervisor) • To obtain ground truth of bole for each query – We sent emails to 50 senior researchers and 50 junior researchers (91.6% are post doc or graduates) – Average their feedbacks 40 CT4: Social Influence Analysis (KDD’10, KDD’10, KDD’09, ICDM’09, JIS) • How to quantify the influence between users? • What is the relationship between users? • How to discover topic distribution over links? Dr. Tang Association... cite • Can we predict the user’s actions? SVM... publish write write publish WWW Tree CRF... publish write Limin cite publish ISWC cite Prof. Wangpublish cite publish write EOS... Semantic... write Annotation... write Pc member write 41 Prof. Li coauthor Write coauthor IJCAI Topic-based Social Influence Analysis • Social network -> Topical influence network 42 Social Influence Sub-graph on ―Data mining‖ 43 Influential nodes on different topics 44 CT4: Social Influence Analysis (KDD’10, KDD’10, KDD’09, ICDM’09, JIS) • How to quantify the influence between users? • What is the relationship between users? • How to discover topic distribution over links? Dr. Tang Association... cite • Can we predict the user’s actions? SVM... publish write write publish WWW Tree CRF... publish write Limin cite publish ISWC cite Prof. Wangpublish cite publish write EOS... Semantic... write Annotation... write Pc member write 45 Prof. Li coauthor Write coauthor IJCAI Mining Advisor-Advisee Relationship from Research Publication Networks Input:T em poral collaboration netw ork O utput:R elationship analysis 1999 A da 22000 000 (0.9,[/,1998]) A da B ob (0.4, [/,1998]) (0.5,[/,2000]) 2000 2001 Y ing 2002 Sm ith (0.8,[1999,2000]) Jerry (0.7, [2000,2001]) B ob Y ing 2003 2004 46 (0.65,[2002,2004]) Sm ith (0.49, [/,1999]) Jerry (0.2, [2001,2003]) V isualized chorologicalhierarchies Results 47 Application: visualization RULE 48 TPFG Bole Search In Arnetminer An example on a real system: Arnetminer 49 Performance improvement Results (cont.) 50 CT4: Social Influence Analysis (KDD’10, KDD’10, KDD’09, ICDM’09, JIS) • How to quantify the influence between users? • What is the relationship between users? • How to discover topic distribution over links? Dr. Tang Association... cite • Can we predict the user’s actions? SVM... publish write write publish WWW Tree CRF... publish write Limin cite publish ISWC cite Prof. Wangpublish cite publish write EOS... Semantic... write Annotation... write Pc member write 51 Prof. Li coauthor Write coauthor IJCAI Examples – Topic distribution analysis over citations Researcher A • an in-depth understanding of the research field? An Inverted Index Implementation Introduction of Modern Information Retrieval Topics Filtered Document Retrieval with Frequency-Sorted Indexes Parameterised Compression for Sparse Bitmaps Memory Efficient Ranking Topic 31: Ranking and Inverted Index Topic 1 : Theory Topic 27: Information retrieval Topic 23: Index method Signature les: An access Method for Documents and its Analytical Performance Evaluation Topic 21: Framework Self-Indexing Inverted Files for Fast Text Retrieval Topic 34: Parallel computing VS. Topic 22: Compression A Document-centric Approach to Static Index Pruning in Text Retrieval Systems Other Vector-space Ranking with Effective Early Termination Citation Relationship Type Efficient Document Retrieval in Main Memory Semantic citation network 52 Static Index Pruning for Information Retrieval Systems Original citation network Basic theory Comparable work Other Problem: Link Semantic Analysis Citation context words Link semantics 53 Topic modeling over links Pairwise Restricted Boltzmann Machines (PRBMs) Link category Latent variables defined over the link to bridge the two pages Topic distribution Example 54 Link context words Pairwise Restricted Boltzmann Machines (PRBMs) Accuracy of Link Categorization Tested on Arnetminer citation data gPRBM: our approach with generative learning dPRBM: our approach with discriminative learning hPRBM: our approach with hybrid learning 55 CT4: Social Influence Analysis (KDD’10, KDD’10, KDD’09, ICDM’09, JIS) • How to quantify the influence between users? • What is the relationship between users? • How to discover topic distribution over links? Dr. Tang Association... cite • Can we predict the user’s actions? SVM... publish write write publish WWW Tree CRF... publish write Limin cite publish ISWC cite Prof. Wangpublish cite publish write EOS... Semantic... write Annotation... write Pc member write 56 Prof. Li coauthor Write coauthor IJCAI What can we do in SNS? 57 Social Action Twitter Flickr KDD Add favorites Comment on Haiti Earthquake 58 Publish in KDD Conference Action Comment on Haiti Earthquake Time t Time t+1 1. Always watch news 2. Enjoy sports 3. … 59 Attribute Results 60 Arnetminer Today — A brief summary 61 ArnetMiner’s History • 2006/5, V0.1 Perl-based CGI version – Profile extraction, person/paper/conf. search • 2006/8, V1.0 Java (Demo @ ASWC) – Rewrite the above functions • 2007/7, V2.0 (Demo @ KDD, ISWC) – New: survey search, research interest, association search • 2008/4, V3.0 (Demo @ WWW) – Query understanding, New search GUI, log analysis • 2008/11, V4.0 (Demo @ KDD, ICDM) – Graph search, topic mining, NSFC/NSF • 2009/4, V5.0 (Demo @ KDD) – Bole/course search, profile editing, open resources, #citation • 2009/12, V6.0 – Academic statistics, user feedbacks, refined ranking • V7.0, coming soon – Name disambiguation, reviewer assignment, supervisor suggestion, open API 62 ArnetMiner Today * Arnetminer data: > 0.6 M researcher profiles > 3M papers > 17M citation relationships > 5K conferences > 50M logs Messages from Users … I’ve happened to visit your Arnetminer, and shocked. It was really impressive, its usefulness and your works!!! … [from …@selab.snu.ac.kr] …I would first of all congratulate you on the excellent work you have done in Arnetminer and I am much inspired… [from …@nu.edu.pk] a survey by UK Southampton … Arnetminer is one of my favorite tools to find * Visits come from more than 190 Title: Semantic Technologies for Learning and Teaching in Web 2.0. [from …@qlink.com] folk and academic relatives… Top 10 countries — Thanassis Tiropanis, Hugh Davis, Dave Millard, Mark Weal countries 1.outside Canada Dr. JieUSA Tang, …Exposing the expertise of the institution to the world in order6. to attract * Continuously +20% increase of Dear our example papers (http://www.waset.org ) 2.include China Japan funding students. ArnetMiner is the Can mostyou representative of7.such tools visits per and month in your 3. Arnetminer?... at the moment… Germany[from …@waset.org] 8. Taiwan 4.across 9. France .. top top! I India am very interested in your ArnetMiner. Contextualised queries and searches, repositories potentially in * >100,000 page views per day searches that possible give me a bit of10. yourItaly social 5. of UK different departments or institutions, andIsmatching people for collaborative data… activities. Best example of the surveyed network technologies to[from this …@cse.ust.hk] end is ArnetMiner. 63 Opportunity: exploiting semantic web and social network in the real-world Web, relational data, ontological data, social data Data Mining and Social Network techniques Scientific Literature Social search & mining Advertisement Mobile Context Energy trend analysis Large-scale Mining Users cover >180 countries >600K researcher >3M papers Social network Extraction Social network Mining Advertisement Recommendation Mobile search & recommendation Energy product Evolution Techniques Trend Scalable algorithms for message tagging and community Discovery Arnetminer.org IBM Sohu Nokia Oil Company Google 科技信息资源内容监测与分析服务平台 (中国科技部信息情报研究所) Search, browsing, complex query, integration, collaboration, trustable analysis, decision support, intelligent services, 64 Arnetminer PatentMiner 65 ScopusMiner PubmedMiner CheMiner Representative Publications • • • • • • • • • • • • Jie Tang, Jing Zhang, Ruoming Jin, Zi Yang, Keke Cai, Li Zhang, and Zhong Su. Topic Level Expertise Search over Heterogeneous Networks. Machine Learning Journal. Jie Tang, Limin Yao, Duo Zhang, and Jing Zhang. A Combination Approach to Web User Profiling. ACM TKDD, 2010. Juanzi Li, Jie Tang, Yi Li, Qiong Luo. RiMOM: A Dynamic Multi-Strategy Ontology Alignment Framework. IEEE TKDE, 2009. Chenhao Tan, Jie Tang, Jimeng Sun, Quan Lin, and Fengjiao Wang. Social Action Tracking via Noise Tolerant Time-varying Factor Graphs. KDD’10. Chi Wang, Jiawei Han, Yuntao Jia, Duo Zhang, Yintao Yu, Jie Tang, Jingyi Guo. Mining Advisor-Advisee Relationships from Research Publication Networks. KDD’10. Jie Tang, Jimeng Sun, Chi Wang, and Zi Yang. Social Influence Analysis in Large-scale Networks. KDD'09. Jie Tang, Jing Zhang, Limin Yao, Juanzi Li, Li Zhang, and Zhong Su. ArnetMiner: Extraction and Mining of Academic Social Networks. KDD’08. Jie Tang, Hang Li, Yunbo Cao, and Zhaohui Tang. Email Data Cleaning. KDD’05. Jie Tang, Ho-fung Leung, Qiong Luo, Dewei Chen, and Jibin Gong. Towards Ontology Learning from Folksonomies. IJCAI’09. Qian Zhong, Hanyu Li, Juanzi Li, Guotong Xie, Jie Tang, Lizhu Zhou. A Gauss Function based Approach for Unbalanced Ontology Matching. SIGMOD’09. Feng Shi, Juanzi Li, Jie Tang. Actively Learning Ontology Matching via User Interaction. ISWC’09. Chonghui Zhu, Jie Tang, Hang Li, Hwee Tou Ng, and Tiejun Zhao. A Unified Tagging Approach to Text Normalization. ACL’07. Others: ICDM’07-09, CIKM’07-09, SDM’09, ISWC’06, DKE, JIS, etc. 66 Thanks! Demo: http://arnetminer.org HP: http://keg.cs.tsinghua.edu.cn/persons/tj/ 67