Jaebong Son
1-720-360-5057
855 W. Dillon Rd. Louisville F203 Boulder, Colorado, 80027
[email protected] [email protected]

SUMMARY

I aim to solve the business problems that companies face in the Big Data era, drawing on my expertise in computational linguistics, graph theory, business intelligence (BI), database design and implementation, and machine learning techniques.

EDUCATION

UNIVERSITY OF COLORADO, Leeds School of Business, Boulder, CO
Research Associate in Leeds School of Business, Aug. 2013 ~ Present

UNIVERSITY OF ARIZONA, Eller College of Management, Tucson, AZ
Management Information Systems with a minor in Computational Linguistics; completed 2nd-year Ph.D. coursework, July 2012

UNIVERSITY OF ARIZONA, Eller College of Management, Tucson, AZ
Master of Science in Management Information Systems (Technical track), May 2010

KOREA UNIVERSITY, Seoul, Korea
Master of Science in Management Information Systems, Feb. 2006

KONKUK UNIVERSITY, Seoul, Korea
Bachelor of Arts in Business Administration and Management Information Systems, Feb. 2003

ANALYTICS SKILL SET

Natural Language Processing (NLP)
- Conducted topic modeling with LDA (Latent Dirichlet Allocation) on news articles, blog posts, and forum threads to answer the question “What has been going on?”
- Analyzed the sentiment of movie reviews written in English and Korean using the SVM (Support Vector Machine) algorithm
- Built an NER (Named Entity Recognition) component to identify Person, Location, and Organization entities based on the Stanford NLP parser
- Implemented a statistical English POS (Part-Of-Speech) parser based on a Hidden Markov Model (HMM) and the Viterbi algorithm
- Used probabilistic language models to extract key phrases from unstructured text data

Graph Theory and Implementation
- Built a recommendation system based on graph theory and centrality measures such as eigenvector, betweenness, and closeness centrality
- Implemented graph-based clustering (community detection) to find subgraphs whose nodes share characteristics, identified by their relations with other nodes in the graph
- Combined topic modeling with graph-based community detection to group similar topics into topic groups

Business Intelligence (BI)
- Diverse experience with Microsoft SQL Server, including Analysis Services (SSAS) and Integration Services (SSIS)
- Highly skilled in implementing SQL Server objects such as tables, indexes, stored procedures, triggers, and functions
- Executed several content (unstructured data) analyses by leveraging Transact-SQL, the text mining components of SSIS, and the clustering algorithm of SSAS
- Implemented an automated web recommender system by interconnecting the SQL Server relational database engine (T-SQL), SSAS (Data Mining Extensions), and SSIS (Workflow)
- Familiar with dimensional modeling, relational modeling, star and snowflake schemas, and fact and dimension tables
- Experienced in implementing machine learning algorithms such as K-means clustering and HITS (Hyperlink-Induced Topic Search) in Transact-SQL

Programming languages: Java, Python, and ASP.NET
- Highly skilled and specialized in processing text using regular expressions
- Designed and implemented several Java-based crawling systems using SQL Server as the repository
- Experienced in implementing machine learning algorithms such as Bayesian networks, Markov chain models, neural networks, and genetic algorithms

EXPERIENCE

07/2013~02/2014 IBM GBS, Seoul, Korea
Freelancer, Managing Consultant, Advanced Analytics, Business Analytics and Optimization (BAO)
- Participating in risk modeling using user-generated content (text) in social media

09/2012~07/2013 IBM GBS, Seoul, Korea
Managing Consultant, Advanced Analytics, Business Analytics and Optimization (BAO)
Project leader (04/2013) of the Samsung Electronics Big Data Platform project (POC)
- Successfully achieved all tasks given by Samsung MSC Unit and delivered all results
- Designed and implemented a buddy recommender system
- Executed topic identification, extraction, and clustering based on movie synopses
- Produced a 2-mode network to identify topic-movie relations
Project leader (09/2012~03/2013) of the LG U+ Big Data Platform project
- Delivered a series of successful analyses of customers’ purchase and viewing behavior for HDTV content, their Internet browsing behavior, and a recommender system for HDTV content
- Designed and implemented a recommendation module for HDTV content for smartphone customers by leveraging structured and unstructured data
- Designed an integrated data model from diverse data sources
- Implemented a Social Network Analytics module for the recommender system

01/2010~07/2012 Artificial Intelligence Laboratory, Tucson, AZ
Research associate
Project leader (10/2010~07/2012) of nanotechnology research funded by the National Science Foundation (NSF award #0926270)
- Executed content analysis on nanotechnology patents to extract technology topics
- Carried out social network analysis to map inventors’ and assignees’ collaboration networks in the nanotechnology and semiconductor industries
- Conducted competitor analysis of the semiconductor industry, covering Taiwan Semiconductor Manufacturing Company (TSMC), Samsung Electronics, IBM, and Micron Inc., using patents issued by the USPTO (United States Patent and Trademark Office) (Reference: http://ai.arizona.edu/mis510/other/TSMC%20Patent%20Analysis.ppt)
- Designed and developed a large-scale data collection system to gather forum postings from politically unstable countries such as Afghanistan, Somalia, Lebanon, and Yemen for research purposes (70 forums from 14 countries)

01/2006~12/2006 SPSS, Seoul, Korea
Data Modeler, Data Mining Consultant, Consulting Division
- Executed two BI projects at Korea Exchange Bank (KEB) and Korean Transportation Safety Authority (KOTSA)
- Developed data mining and statistics components for BI systems based on Microsoft SQL Server, the SPSS statistics package, and Clementine data mining software
- Analyzed business requirements to gather information for system design and analysis
- Participated in creating logical and physical data models to meet user and business requirements
- Developed SSIS packages to extract, transform, and load customer data from the OLTP databases for data mining tasks
- Participated in writing a book about data handling using the SPSS statistics package
- Lectured on data handling using the SPSS statistics package at SPSS Education Center

05/1997~07/1999 THE THIRD QUARTERMASTER CORPS OF KOREA, Kyongki, Korea
Sergeant, IMO (Information Management Office) assistant
- Installed, upgraded, and managed Microsoft SQL Server 7
- Developed and managed SQL objects such as tables, procedures, functions, and indexes

RESEARCH EXPERIENCE

- Jaebong Son et al. (2013). Global nanotechnology development from 1991 to 2012: patents, scientific publications, and effect of NSF funding. Journal of Nanotechnology Research (JNR).
- Jaebong Son et al. (2013). Nanotechnology Public Funding and Impact Analysis: A Tale of Two Decades (1991-2010). IEEE Nanotechnology Magazine, vol.7, issue 1, pp.914.
- Woo, J., Son, J., and Chen, H. (2011). An SIR model for violent topic diffusion in social media. 2011 IEEE International Conference on Intelligence and Security Informatics, Beijing, China, pp.15-19.
- Son, J. and Suh, Y. (2006). Using Degree of Match (DOM) to Improve Prediction Quality in Collaborative Filtering System. Information Systems Review (ISR), vol.8, issue 2, pp.139-154.

INVITED SPEAKER

- Competitive Landscape Analysis through Patent Analytics, ETRI (Electronics and Telecommunications Research Institute), May 2012
- Intellectual Property and Text Analytics: A Case of Patent Analytics, ETRI (Electronics and Telecommunications Research Institute), Dec. 2011

CERTIFICATION AND OTHERS

Certification
- Risk Analyst, Entry Level, CNSS No.4016, 2009
- Microsoft Data Base Administrator (MCDBA), 2003
- Oracle Certified Professional (OCP), 2001

Additional Professional Education
- Advanced Statistical Analysis at SPSS Education Center (2 Courses), 2004
- Microsoft MCDBA 2000 Course (4 Courses), 2001

Other Projects
- Participated in designing and developing a data warehouse system at TUSD (Tucson Unified School District), 2010
- Conducted cancer data analysis to identify the relationship among cancer, treatment, and cost using Microsoft SQL Server, WEKA, and Clementine, 2005
- Executed query optimization for performance tuning of SQL Server 2000 at Epson Korea, 2004