Research Profiling – Using VantagePoint to characterize a body of research publications
• A series of short presentations (“podcasts”)
• Mining Web of Science data
• Case example: nano-enhanced, thin-film solar cells

Alan Porter
Director of R&D, Search Technology, Inc. [& Georgia Tech]
[email protected]

Pod 1: Overview of Research Profiling & Getting data from Web of Science

Research Profiling
1. Overview of the general process & getting data
2. Data into VantagePoint & cleaned
3. Basic descriptors + (tentatively):
   a) Trends
   b) Topical emphases & Changes
   c) Influence Measures
   d) Research Networking: Maps
   e) Locating a body of research: science & geo maps
   f) Super Profiling: Breakouts
   g) Advanced Analyses

Session Strategy
A. ~10 minutes per session – sequential, but you can skip to topics of interest after the introduction
B. Aim: To stimulate your ideas on how to apply VantagePoint to gain insights from sets of research publications
C. This first set of sessions keys on Web of Science (“WOS”) results with a technology topic search focus – i.e., “what?”
D. A future set will key on WOS search results based on searching on a given organization – i.e., a “who?” focus
E. Case example: Nano-enhanced Solar Cells [with special thanks to Ying Guo]

5 Stages in Mining External R&D Knowledge
1. Literature review (within research community)
2. Research Profiling: Characterizing a body of research publication activity
   • Focus on research activities
   • Largely descriptive
3. Tech Mining
   • Multiple data to mine
   • To generate effective technical intelligence
4. Structured Knowledge Discovery
5. Literature-Based Discovery (“LBD”)

Research Profiling 1: Getting Going
A. General overview of the Research Profiling process and its aims: Questions; Answers; Data
B. Search; download

How to do Tech Mining (or Research Profiling): 8 steps
1. Spell out the questions and how to answer them
2. Get suitable data
3. Search (iterate)
4. Import into text mining software (e.g., VantagePoint)
5. Clean the data
6. Analyze & interpret
7. Represent the information well – communicate!
8. Standardize and semi-automate where possible

Start with the questions!

Types of Questions
Text and data mining techniques are good at addressing: WHO? WHAT? WHEN? WHERE?
Additional questions usually require more human insight: HOW? WHY?

“Answers”: Innovation Indicators
• Technology Life Cycle Indicators – e.g., growth curve location & projection
• Innovation Context Indicators – e.g., presence or absence of success factors (funding, standards, infrastructure, etc.)
• Product Value Chain and Market Prospects Indicators – e.g., applications, sectors engaged

Six information types
Technical Information
• Science, Technology & Innovation (“ST&I”) Databases (e.g., Web of Science; CSCD, Thomson Innovation)
• Internet Sources (e.g., Googling)
• Technical Expertise
Contextual Information
• Business, competition, customer, policy, popular content Databases (e.g., Thomson One)
• Internet Sources (e.g., blogs, website profiling)
• Business Expertise

VantagePoint Import Filters and Tools
A wealth of diverse information sources for innovation management

On-line Data Sources: Cambridge Scientific Abstracts, Delphion, Dialog, EBSCOHost, Ei Engineering Village, Factiva, ISI Web Of Knowledge, Lexis Nexis, Micropatent, Ovid, Patbase, Questel-Orbit, SilverPlatter, STN, Thomson Innovation

Databases: Aerospace, Art Abstracts, Biobase, Biological Abstracts, Biological Sciences, Biosis, Biotechno, Business & Industry, CAPlus (AnaVist export), Cassis, CBNB, Claims, Computer & Info Systems, Corrosion, Current Contents, Derwent Biotech Abstracts, Derwent Innovations Index, Derwent World Patent Index, Ei Compendex, EMBase, EnCompass Literature, EnCompass Patents, Energy, EnergySciTech, Engineering Materials Abstr, Envr Sci & Pollution Mgmt, ERIC, EuroPat, FamPat, Focust, Food Sci & Tech, Foodline Market, Foodline Science, Forege, Frosti, FSTA, Gale PROMT, GeoRef, Global Reporter, IFIPAT, IFIUDB, INPADOC, INSPEC, IPA, ISD, ITRD, JAPIO, JICST, Kosmet, LGST, MATBUS, Medline, METADEX, Mgmt and Org Studies, Micropatent, Materials, Mobility, NSF Awards, NTIS, Pascal, Patent Citation Index, PCT, PCTPAT, Phin, Pira, Pluspat, PROMT, PsycINFO, PubMed, Rapra, Recent Refs, Reference Manager, Science Citation Index, SciSearch, Scopus, Tech Research, ToxFile, Transport, USApps, USPat, Waternet, WaterResAbs, Web of Science, WeldaSearch, Wisdomain

Custom Data: Comma/tab delimited tables, Microsoft Excel and Access, SmartCharts, XML

Record/Field Tools
• Combine duplicate records
• Remove duplicate records
• Create “frankenrecords” (merge records from dissimilar sources)
• Classify records
• Merge fields
• Clean up fields
• Apply thesauri

Management Issues
• Requires Access to External Information (License)
• Bulk Processing is a must
• Download in electronic form
• Requires competence in searching

Case Examples: Getting to the data – usually via internet
Case Examples: Getting the data – search within databases
Case Examples: Retrieving the data

Resources
• www.theVantagePoint.com – offers multiple papers and some case analyses
• View the VantagePoint Video Tutorial Series by Paul Oldham on the website, especially Sessions 1, 2 & 3
• Tech Mining by Alan Porter and Scott Cunningham, Wiley, 2005.
• Porter, A.L., Kongthon, A., Lu, J-C., Research Profiling: Improving the Literature Review, Scientometrics, Vol. 53, p. 351-370, 2002.

Pod 2: Cleaning the Data in VantagePoint

Getting the data into VantagePoint
1. Open VantagePoint
2. File > Import Raw Data File
3. Import Wizard opened: Select Files
4. Select a suitable import filter > Next
5. Select fields to import – maybe Secondary Fields too – you can later “import more fields”

Case Examples: Summary Sheet
• VPT file: Fields available; Counts; Coverage of record set
• “Right-Click” to: set data type; rename; view statistics; etc.

Search Refinement
• Confirm your search boundaries: time, geographical, institutional
• Check your search quality
   – Precision: how much noise did you retrieve?
   – Recall: what did you miss?
• Check in VantagePoint
   – Are you finding researchers and organizations you expect?
   – Topical inclusion: especially check key terms
      – Keywords (authors)
      – Keywords Plus (based on recurring phrases in the titles of papers referenced by the documents you’ve retrieved)
      – Title NLP (Natural Language Processing) phrases
      – Or a combination of these (use “Merge Fields”)
   – You may well identify terms to try out in your WOS search
• Ask knowledgeable technical folks to review and advise
• Redo your search and download

Data Cleaning
• Just pointers here
• Fields > List Cleanup – Window opens
   Select field
   Select “.fuz” to apply, e.g.:
   – Organization Names.fuz
   – Person Names.fuz
   – General.fuz
   – BritishAmericanSpelling.fuz
   Option: Verify matches w/another Field [e.g., Person Names with Author Affiliation]
• Fields > Thesaurus – Window opens
   Select field
   Select “.the” to apply, e.g., provided by Search Technology:
   – Country.the
   – AcadCorpGov.the
   Or select custom thesauri, e.g.:
   – Azerbaijan Natl Acad Sci name variations in WOS.the

Whew!
• Remember to check your search coverage.
• Redo a refined search as needed
• Import and clean your data as warranted
• And the next podcast will get us into Research Profiling!
• Basic Descriptors coming up next

Pod 3: Dealing with single fields: Getting set to work with Lists

Research Profiling Segment 3: “Basic descriptors”
A. Data prep – getting the target fields (variables) all set
B. “Top N” lists and such [single field tallies across the record set]

Nano-enhanced Thin-film Solar Cells: Analysis of Global Research Activities with Future Prospects
Ying Guo, Ph.D. Candidate, Beijing Institute of Technology; Visiting Student, Georgia Institute of Technology
Alan L. Porter
Lu Huang
International Association for Management of Technology, 2009

Data Prep (1)
1. If you have refined your search, re-import
2. Clean as suitable to meet your objectives; for basic descriptors, especially check:
   a. Publication Years [year.the available, but Web of Science data are usually clean]
   b. Countries [apply country.the]
   c. Affiliations [organization names.fuz]
   d. Authors [person names.fuz; potentially “verify matches with another field” – use Affiliations to help disambiguate names]
3. If you are apt to deal with a topic in the future, save List Cleanup results as your own topical thesaurus.

Data Prep (2)
1. Topical fields
   a. Make Macro-disciplines from Subject Categories [not a standard VP thesaurus, but we plan to make available on our new academic website]
   b. Keywords: decide if you want to MERGE some combination of: Keywords (author’s) & Keywords Plus & Title (NLP) phrases & Abstract (NLP) phrases
2. Keyword Clumping options
   a. Human: Scan the combo Keywords field of choice; make groups of interesting terms using FIND
   b. Statistical: After a little pre-cleaning, use Factor Mapping to form groups of the top %’s [e.g., 1%, 2%, 5% of records]; examine their performance; pick the best level to get at topical emphases

Top N’s
1. (Document types)
2. (Publication Years)
3. (Times Cited)
4. Countries
5. Affiliations
6. Funding agencies
7. Authors
8. Journals (or Sources)
9. Key terms
10. Subject Categories
11. Macro-Disciplines
12. Organization Types

Top N’s
1. Pick your output venue(s) – e.g., in VP and/or MS Excel, Word, Powerpoint
2. Decide if normalization is in order
   a. % of All (or something else)
   b. Across databases or datasets
   c. Table or Figure

DONE! Research Profiling Segment 3: “Basic descriptors”
A. Data prep – getting the target fields (variables) all set
B. “Top N” lists and such [single field tallies across the record set]
   • Fields from the dataset
   • Derived fields

Up next in Segment 4:
• 2 Fields together (matrices)
• Trends
• Discerning “Hot and New” topics

Pod 3+: VP Help & Interactions/Exercises

Help!
1. VantagePoint Help
2. Analyst’s Guide

Interacting
1. Discuss uses of VantagePoint to answer your research profiling questions
   • If you are together in a real or virtual group, discuss materials presented
   • Here’s a starter question (next slide)
2. Perform hands-on exercises

Interactive Ideas/Exercises
1. What “MOT” (management of technology, or technology policy, or research opportunity) questions might you want to answer from a Web of Science dataset? [next slides illustrative]

IAMOT 2009
For S&T Policy Maker and Manager:
• What are national R&D strengths and weaknesses?
• What is the existing status of thin-film solar cells, and what are the likely future developments?
• How to gauge relative opportunities for collaborative development, as well as monitor emerging competitors?
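The “Top N” tallies from Segment 3, with the “% of All” normalization, are a natural first pass at a question like national R&D strengths. As a rough illustration outside VantagePoint, the sketch below tallies a multi-valued field across a record set. The records, the field name `countries`, and the helper `top_n` are all hypothetical; counting each value once per record is an assumption about how one would want the tally to behave, not a description of VantagePoint internals.

```python
from collections import Counter

# Hypothetical cleaned records; "countries" is a multi-valued field.
records = [
    {"year": 2006, "countries": ["USA", "Germany"]},
    {"year": 2007, "countries": ["India"]},
    {"year": 2007, "countries": ["USA"]},
    {"year": 2008, "countries": ["China", "USA"]},
    {"year": 2008, "countries": ["India", "India"]},  # duplicate value within one record
]


def top_n(records, field, n):
    """Return [(value, record_count, pct_of_all_records)] for the n most frequent values."""
    counts = Counter()
    for rec in records:
        # set() makes a value count once per record, even if it is listed twice.
        counts.update(set(rec[field]))
    total = len(records)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(value, count, round(100.0 * count / total, 1)) for value, count in ranked[:n]]


top_countries = top_n(records, "countries", 3)
for value, count, pct in top_countries:
    print(f"{value:10s} {count:3d} {pct:5.1f}%")
```

Because a record can list several countries, the percentages are shares of records and need not sum to 100.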
[Figure: MOT questions – Who, What, When, Where, Why, How. Data mining technology addresses “Global Research Activities” (our paper); “Future Prospects” need more experts’ inputs (we’re working on this).]

IAMOT 2009
We look at:
1. What research fields are involved? (map of science)
2. Quantity: publication numbers and trends
3. Diversity: national contrasts
4. Quality: citations
5. Patterns of research networking: using VantagePoint
6. “Hot” nano-materials

For data:
• Basic Dataset: a global dataset of nano publications downloaded from the SCI
• Search Expression: defined “thin film and (solar or photovoltaic)” as our search expression
• Result Dataset: acquired the dataset containing 1659 records for time period from 2001 to mid-2008

Interactive Ideas/Exercises
2. Search on a topic with colleagues; consider how to refine your search
• Import preliminary search results into VP [do you have the right import filter?]
• Scan key terms, Subject Categories, etc. to check coverage and identify ways to enhance your search
• Refine and rerun the search if warranted and time permits

Interactive Ideas/Exercises
3. Given your MOT questions, what data cleaning is in order?
• Step through cleaning actions for each key field
• Apply suitable “List Cleanup” (using appropriate “.fuz” files)
• Apply thesauri as suitable (“.the” files)

Interactive Ideas/Exercises
4. A possible exercise: Thesaurus enhancement
• Run the AcadCorpGov.the on your cleaned Affiliations field [get rid of existing groups]
• On that resulting field, “Create Group Using Thesaurus” using this same “.the” file. Select “Group for Each Alias.”
• Research (e.g., Google) & assign some of the multiply-occurring organizations to one of the 4 groups.
• “Create thesaurus using groups”; select all 4 groups; save as AcadCorpGov-new date.the
• Run it as thesaurus; run it to create groups.

Interactive Ideas/Exercises
5. A Web of Science Key Terms exercise
• Merge fields (candidates include Keywords-Author; KeywordsPlus; Title NLP phrases; Abstract NLP phrases)
• Apply general.fuz
• Apply stopwords.the
• Make your own “interesting” key terms set
   • Scan for an interesting term; use FIND with “select all” and make a GROUP of variations of that term
   • Repeat for several interesting terms, making more groups
   • Create a new Field from Group Names
• Use Factor Map to statistically make a key terms set
   • Make a group in the Key Terms field – selecting interesting terms appearing in, say, >1% of the records
   • Run Factor Map – then check out the resulting term grouping (in a new Key Terms field created)
• Compare the two key term sets – either useful?

Interacting
1. We’ll insert more candidate exercises as we proceed, without great elaboration – use as you choose
2. Now, back to the show

Pod 4: Matrices

Nano-enhanced Solar Cell Web of Science Subject Category Concentrations of the Leading Countries

Subject Category                        USA  India  Germany  Japan  China
Materials Science, Multidisciplinary    126    132       83     68     63
Physics, Applied                        112     92       68     53     56
Physics, Condensed Matter                59     80       47     46     72
Chemistry, Physical                      82     28       34     32     26
Energy & Fuels                           26     16        9     10     49
Materials Science, Coatings & Films      24     26       17     21     21

[Figure: Acad-Corp-Gov Publishing by Country]

Cross-national Collaboration
[Table: 10 × 10 co-authorship matrix for USA, India, Germany, Japan, China, France, UK, South Korea, Mexico and Spain, with a “% International Cooperation (among top 10)” value per country. The diagonal (records per country) reads 288, 239, 195, 182, 182, 113, 84, 69, 65, 63; the percentages range from 10.4% to 52.2%. The off-diagonal cells and the assignment of percentages to countries are not recoverable from this extraction.]

Matrix-related Topics covered in VantagePoint
• Matrix Viewer
   • Multiple visualizations available
• Activity-Diversity
   • Scattergram for one variable based on 2 others
• Aduna Clustering
   • Colorful visualization of intersecting sets (e.g., coauthoring)
   • Capability to zoom to records at those intersections (extending to >2-way connections)

Pod 5: Trends

Trends
1. Decide if normalization is in order
   a. Over time [rate of change]
   b. Most recent year
2. Decide if comparative analyses are in order
   a. What/who are the benchmarks?
   b. How do you want to present your results?

[Figure: DSSC research by organization type (from SCI)]

[Figure: # of author affiliations/paper for DSSC publications (SCI)]

[Figure: Nano-Structured ZnO Thin-film Solar Cells Publication by Countries and Years, 2001-2007 – China, India, Japan, USA, Mexico, Germany, South Korea, Spain, France. China and India are notable!]

IAMOT 2009
[Figure: Nano-Structured ZnO Thin-film Solar Cells Publication: Top 10 countries by Years (100% stacked, 2001-2007) – note the increasing share for India & China]

[Figure: DSSC Publications (SCI) with % 2006 or later]

[Figure: Share of Nano-enhanced Thin-film Solar Cells Publications by Countries [Science Citation Index, 2001-08 (part-year)] – shares in 2001, 2003, 2005 and 2007 for USA, India, Germany, Japan, China]

[Figure: Projecting Nano-enhanced Solar Cell Research Activity – actual data and projected data]

Research activity and impact characteristics: First Way
[Figure: quality (# of citations) vs. activity (# of records) for USA, UK, South Korea, Mexico, France, Spain, Japan, Germany, India, China]
• Nodes above the diagonal suggest relatively higher quality (US and UK). Below the diagonal, the closer to the diagonal, the higher the quality of that country’s research.
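The “First Way” reading (above or below the diagonal) amounts to comparing each country’s citations per record against a reference slope. A minimal sketch follows, assuming the diagonal is the overall citations-per-record rate across the plotted countries; the slide does not say how the diagonal was drawn, so that choice is an assumption, and the numbers are hypothetical rather than read off the chart.

```python
# Hypothetical (records, citations) per country; not the values from the slide.
countries = {
    "USA": (300, 1800),
    "UK": (80, 560),
    "China": (200, 600),
    "India": (240, 480),
}


def citation_profile(data):
    """Compare each country's citations per record with the pooled rate (the assumed diagonal)."""
    total_records = sum(recs for recs, _ in data.values())
    total_citations = sum(cites for _, cites in data.values())
    diagonal_slope = total_citations / total_records
    profile = {}
    for name, (recs, cites) in data.items():
        per_record = cites / recs
        profile[name] = {
            "cites_per_record": per_record,
            # > 1.0 means the node sits above the assumed diagonal.
            "relative_to_diagonal": per_record / diagonal_slope,
            "above_diagonal": per_record > diagonal_slope,
        }
    return diagonal_slope, profile


slope, profile = citation_profile(countries)
for name, row in sorted(profile.items(), key=lambda kv: -kv[1]["relative_to_diagonal"]):
    side = "above" if row["above_diagonal"] else "below"
    print(f"{name:6s} {row['cites_per_record']:.2f} cites/record ({side} the diagonal)")
```

The `relative_to_diagonal` ratio also orders the countries that fall below the line, which matches the slide’s reading that closer to the diagonal means higher quality.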
Research activity and impact characteristics: Second Way
[Figure: # of Aged* Citations, 2001 and 2006, vs. # of Records, 2001 and 2006, for US, India, China, Germany, Japan; year denoted by start and end points]
• The steeper the slope of the line connecting these two points, the greater the increase in quality of the country’s research on this topic
• Compared with Japan and Germany, China and India are upgrading!

Pod 6: “Hot topics”
Case example: nano-enhanced, thin-film solar cells [Ying Guo, Lu Huang & me]

“Hot” topic as shown by relative trends
ZnO attracts increasing attention in recent years and is on trend to catch up with TiO2

Ratio of Occurrences 2007-08 to those in 2001-06

Top 20 Key Terms             ratio-recent  # Records
conjugated polymer                   1.14         47
fabrication                          0.85         74
TiO2                                 0.85         61
chemical vapor deposition            0.74         66
amorphous silicon                    0.65         28
morphology                           0.53         72
semiconductor                        0.52         94
fullerene                            0.50         48
zinc oxide                           0.48         49
microstructure                       0.46         51
spray pyrolysis                      0.41         65
heterojunction                       0.36         49
CdTe                                 0.32         37
electrodeposition                    0.29        102
CuInSe2                              0.28         92
anatase                              0.24         21
chemical bath deposition             0.22         39
Cu(In                                0.17         21
sol-gel                              0.00         37
photoconductivity                    0.00         22
Top 20 Key Terms combined            0.44

New Topics via List Comparison
• Create VP sub-dataset for the recent nano-enhanced solar cells publications (new VP file – I used 2007-08)
• Create VP sub-dataset for the earlier publications (I used 2001-06)
• Under GROUPS, choose LIST COMPARISON; I did so from the select keywords list (82) for 2007-08 and made a new group of those unique to this dataset in comparison to the earlier one.
• Results: “characterize” and “deposit” are the 2 novel ones [Warrants in-depth probing to check if these are meaningful]

Key Terms by First Year
New Key Terms Recently

Year  Records  New Terms
2005      225  3: device [8 of 54]; TiO2 film [8 of 29]; cD [5 of 27]
2006      334  2: nanocrystal [10 of 25]; room temperature [4 of 24]
2007      372  2: DEPOSIT [37 of 52]; CHARACTERIZE [25 of 25]
2008      174  0

Recent Entrants
• We need not restrict the temporal comparison to key terms or topics
• Same modus operandi can be applied to identify new or recent entrants to the research (e.g., first papers on the topic from a given organization)
• Another variant is the inverse – to look for which participants seem to have abandoned the topic (no publications since Year X)

Pod 7: Maps

Visualization (Maps)
1. VantagePoint Maps
   • Auto-correlation maps
   • Cross-correlation maps
   • Factor maps
2. Social Network Analysis (SNA)
3. Science Overlay Maps
4. Geo-mapping

Auto-Correlation Maps
[Figure: NETFSC Research networking comparison – USA (dispersed) vs Germany (1 central organization)]

Auto-correlation vs. Cross-correlation

[Figure: Nano-enhanced Solar Cells Country Research Networks]

Factor Map (Principal Components Analysis) – groups terms based on their tendency to co-occur across records

Social Network Analysis (SNA)
• VantagePoint offers several application opportunities
   • Create a sub-dataset for a given country or organization
   • Within that target group, for the given research topic, explore research network connections
• Examples
   • Collaborations
   • Shared interests
   • Discrepancies between interests & collaboration
• Working with Pajek adds options
   • Calculation of networking statistical measures (e.g., centrality)
   • More mapping nuances

Science Overlay Map [see: www.idr.gatech.edu – includes “how to make your own map” and full citations]
[Figure: Nano-Thin-Film Publications 2001-08, distribution overlay over the base 175 Subject Category Science Map (Leydesdorff & Rafols, forthcoming). Macro-discipline labels: Agri Sci, Geosciences, Infectious Diseases, Ecol Sci, Env Sci & Tech, Chemistry, Clinical Med, Biomed Sci, Health Sci, Cognitive Sci, Mtls Sci, Engr Sci, Computer Sci, Physics. Highlighted Subject Categories: Energy & Fuels; Chemistry, Physical; Materials Science, Coatings & Films; Physics, Applied; Materials Science, Multidisciplinary; Physics, Condensed Matter.]

Nanotechnology Thin-film Solar Cells Publications by Research Field

Science Overlay Mapping
1. Start with Web of Science file in VantagePoint
   • Map the Subject Categories or Cited Subject Categories (somewhat complicated process)
      • Special import filter to extract cited source titles
      • Applies a special Find/Replace thesaurus to those to make titles more standardized (e.g., J vs. Jnl vs. Journal)
      • We then apply a special macro that uses a Journal-to-Subject Category thesaurus to get Cited Subject Categories (“SCs”)
   • Output a vector file of SCs or Cited SCs
2. In Pajek
   • Select the SCI (175 SC) or SCI+SSCI (221 SC) base map
   • Edit your map (e.g., change node size)
   • Output in desired format (e.g., jpeg)
3. In MS Powerpoint
   • Overlay on the appropriate base map
4. Or, go to www.idr.gatech.edu/ and select “Upload Map”

Geomapping
[Figure: Geo-map: Nano-enhanced Solar Cells – European Institutions >=10 papers]

Pod 7+: Activities for Matrices, Trends, Hot Topics & Maps + … “SuperProfile”

Research Profiling Interactions/Exercises for Matrices, Trends & Hot Topics
**The following exercises may be downloaded at http://www.thevantagepoint.com/webinars.cfm

Interactive Ideas/Exercises
6. Matrix Fun & Games
• In VantagePoint, on your dataset, make a matrix of interest
   • Relate analytical possibilities to spell out what MOT questions these could help answer
   • One family of matrices involves Time (e.g., Year) vs. another variable [“When vs. …”]
   • Another family involves Topic (e.g., Key terms, Subject Categories) vs. Performer (e.g., Country, Affiliation, Author) [“What vs. Who”]
   • An important matrix type entails a variable vs. itself (e.g., Author by Author; Country by Country)
• Try out matrix operations
   • Flood the matrix to different degrees [use the Up & Down bars in the upper left corner cell (headings by headings cell)]
   • Open detail views to explore a group of cells together; select an entry in a detail view to see the records to which it pertains in the title view
   • Paint groups of cells; then re-sort
   • Address one or more MOT questions via your matrix content

Interactive Ideas/Exercises
7. Matrix Viz
• In VantagePoint, with your matrix open, run the MatrixViewer script. [If the view is too cluttered or not interesting, make a more suitable matrix, possibly by creating a group on a particular variable to select key entities.]
• Try different “Layouts”; select and move entities in the viewer
• Export the most interesting layout to file.

Interactive Ideas/Exercises
8. Activity-Diversity
• Make a group of Top Affiliations in your dataset [experiment with this – maybe start with an interesting 15-20]; create a field from group items.
• Open the Activity-Diversity Scatter 3D script; select that field to plot; select the field to measure Diversity (e.g., Subject Categories; Affiliations); select your minimum; try a Graphic Size.
• Say “yes” to “make changes to this chart” – and try out various sizes, axis formats, font and label angles – to get a plot you like. [Hint: You can keep redoing – but you can’t edit once you say ‘no.’]
• Interpret – what can you say about differences in research focus?

Interactive Ideas/Exercises
9. Aduna Clustering
• Create a sub-dataset for a country of interest; save the VP file.
• Create a “top n” (e.g., 10-30) affiliations group in that country dataset.
• Run the AdunaClusterMap macro for that group
• Do you spot any interesting inter-institutional collaborations? Any collaborations involving more than 2 organizations?
• Consider whether such cluster maps could address your MOT issues
   • At a higher level (inter-country collaboration investigation)
   • At a lower level (co-authoring patterns)

Interactive Ideas/Exercises
10. Plot Matrix (for Trend)
• In your VP Summary sheet, check if you have “Number of Authors” [alternatively, “Number of Affiliation (name only)”]; if not, import (they may be secondary fields in the Web of Science import filter)
• Make a matrix of Number of Authors by Publication Year
• Sort; select all values except the last year.
• Run the PlotMatrix script
• Examine the resulting plots in MS Excel; pick one you like, or make another (like the colorful plot of affiliations by year in Pod 5)
• Interpret

Interactive Ideas/Exercises
11. Hot and New
List Comparison
• Pod 6 illustrated use of “List Comparison” to hunt for new terms in recent years; try your own version.
• Pick a suitable set of key terms. If these are a subset of a large field, it may be handy to make a new field of just those terms (e.g., by using “Group” capabilities)
• Break your data set to give “recent” and “earlier” based on publication years; create new Sub-datasets.
• Under the “Groups” menu, select “List Comparison”; compare the same key terms field in the 2 sub-datasets.
• Start with “Unique” and explore what may be of interest. [Expect lots of noise, but some interesting “new” to discover.]
• Try out “List Comparison” for other purposes – e.g., compare two organizations for relative emphases.
Expectancy Values
• Open your Publication Years field. Show your key terms of interest in a Detail Window [see next slide]
• Sort in the Detail Window on the Expectancies (terms with triple or double Up arrows are quick candidate “HOT” topics)

Another Way to get at Hot Topics

Interactive Ideas/Exercises
12. Tracking Term Appearance: Terms by Year
• Pick a terms field (e.g., “Keywords (author’s)”) – but check record coverage
• Open the Terms by Year macro and run for “First Year,” including Summary report in Excel
• Examine the resulting VP list – sort by successive years and see if you can spot a set of potentially interesting “new in Year X” terms for recent years

Mapping
1. Pod 7 introduced 3 types of VantagePoint maps + a couple of maps that begin with VP analyses, extending to use of other software
2. No separate exercise for Factor Maps in VP here – adapt the ideas presented in Pod 7 to large term sets and try out yourself.
3. No separate exercises for:
   • Science overlay maps [Pod 7 points to a helpful website to make your own maps from Web of Science Subject Category lists]
   • Geo-mapping – Pod 7 presented to illustrate possibilities [there are other ways to create geo-maps from Web of Science affiliation information, processed thru VP, working with mapping software]

Interactive Ideas/Exercises
13. Correlation Maps in VantagePoint: Collaboration Patterns within an Organization
• Select the target organization; create a sub-dataset for it
• Open the authors LIST; create a group of interesting authors (e.g., top 15)
• Open the Mapping Wizard; Create an auto-correlation map
• Then go back to the Wizard and Create a cross-correlation map for those same interesting authors; select a topic field (e.g., key terms or Subject Categories)
• Compare the maps – open a couple of Detail Windows to explore what is going on – similarities? Differences?
• Right-click in a map – explore the various options – especially “Edit Preferences”
   • Change the threshold for showing links
   • Change the canvas size
   • Change the font size

Interactive Ideas/Exercises
14. SuperProfile! [really versatile ‘research profiling’ tool – provides “breakouts” for a set of entities to show other field values]
• From the Scripts menu, select SuperProfile
• Pick a field (or group) that you would like to profile (e.g., Country, Subject Category, Publication Year, Highly Cited papers); make selections as the Wizard poses them
• In the “Browser” then – Pick Column Type (e.g., Top Items); Pick Field (e.g., Subject Category); Pick # (e.g., how many Subject Categories to list out); Pick minimum # to include (the “Remove items” option); Pick output type (sheet is in VP; try Excel); Add to Profile.
• Pick another – Column Type (e.g., another “Top Items” type field) – or let’s try “Percent Recent-Database”; Pick field (Publication Year); Pick # of years to use as “recent”; Add to Profile
• Check the MS Excel results; if not quite what you want, redo; if they are what you want, edit for appearance.
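To make the idea of a “breakout” concrete, here is a rough sketch of what a profile with one “Top Items” column and one percent-recent column might compute. It only loosely imitates the idea and is not the SuperProfile script itself: the records, the field names, the function `super_profile`, and the definition of “recent” as a year cutoff are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical cleaned records with one single-valued and one multi-valued field.
records = [
    {"country": "USA", "year": 2005, "subject_categories": ["Physics, Applied", "Energy & Fuels"]},
    {"country": "USA", "year": 2007, "subject_categories": ["Physics, Applied"]},
    {"country": "USA", "year": 2008, "subject_categories": ["Chemistry, Physical"]},
    {"country": "China", "year": 2007, "subject_categories": ["Energy & Fuels"]},
    {"country": "China", "year": 2008, "subject_categories": ["Energy & Fuels", "Physics, Applied"]},
]


def super_profile(records, entity_field, item_field, top_items, recent_from):
    """For each entity, list its top items in another field and its share of recent records."""
    by_entity = defaultdict(list)
    for rec in records:
        by_entity[rec[entity_field]].append(rec)

    profile = {}
    for entity, recs in by_entity.items():
        counts = Counter()
        for rec in recs:
            counts.update(set(rec[item_field]))  # each item counts once per record
        # Descending count, then alphabetical so ties are deterministic.
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_items]
        recent = sum(1 for rec in recs if rec["year"] >= recent_from)
        profile[entity] = {
            "records": len(recs),
            "top_items": ranked,
            "pct_recent": round(100.0 * recent / len(recs), 1),
        }
    return profile


profile = super_profile(records, "country", "subject_categories", top_items=2, recent_from=2007)
for entity, row in profile.items():
    items = "; ".join(f"{name} [{count}]" for name, count in row["top_items"])
    print(f"{entity}: {row['records']} records; top SCs: {items}; {row['pct_recent']}% recent")
```

Each entity becomes a row and each added column type becomes another key in the row, which is the same shape as the spreadsheet the exercise asks you to check.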