Software for data management: The contribution of Stata
Dr Karen Robson, Senior Research Fellow, The Geary Institute, University College Dublin, Ireland

Getting acquainted with Stata
StataCorp develops and distributes Stata, software for statistical analysis. Stata is available for Windows, Macintosh, and Unix computers. It is used by medical researchers, biostatisticians, epidemiologists, economists, sociologists, political scientists, geographers, psychologists, social scientists, and other research professionals who need to analyze data.
- Gaining popularity in the social and medical sciences
- Particularly useful for handling large-scale longitudinal data

Stata/SE (for large data sets)
- can analyze datasets with as many as 32,766 variables; the only limit on observations is the amount of RAM on your computer
- can handle string variables with a maximum length of 244 characters
- can handle matrices up to 11,000 x 11,000
- requires at least 512 megabytes of RAM and 80 megabytes of disk space

Stata/Intercooled (the standard version)
- can analyze datasets with as many as 2,047 variables; the only limit on observations is the amount of RAM on your computer
- can handle string variables with a maximum length of 244 characters
- can handle matrices up to 800 x 800

Small Stata
- A smaller, student version of Stata (for educational purchases only)

Stata/MP
- The fastest and largest version of Stata (for dual-core and multicore/multiprocessor computers)

Resources
- StataCorp website (www.stata.com)
- Timberlake website (www.timberlake.co.uk)
- UCLA Stata “portal” (statcomp.ats.ucla.edu/stata)
- Statalist (www.hsph.harvard.edu/statalist)
- Stata Journal (www.stata-journal.com)
- As well, available Dec 2008

Launching Stata
- OS contingent
- Default window preferences
- Window preferences fully adjustable
- Auto memory set

Comparing with SPSS
- Start-up differences
- With data file open
- Viewing data: data viewer, data editor
- Viewing variables
- Viewing output/commands: output window buffer, log files
- Syntax and “do files”

[Diagram of the Stata workflow. Labels: Variable window; INPUT (Stata command window, Review window, Do file, Pull-down menu); Computation; RESULTS (Output window, Log file)]

Advantages and disadvantages of Stata
- User driven
- Free STBs
- Dedicated journal
- Web active
- Memory requirements
- Backward compatible
- Change!
- SPSS dominance
- Orientated to writing syntax/code
- Pull-down windows debate! (Now available, from version 8 onward)

Advantages and disadvantages of Stata (continued)
- Easier code
- Easier data handling
- Clarity of operations/feedback
- Results table function
- Before version 8, limited graphics
- Now, complex graphics
- Variable labelling
- Editing of output

Advantages and disadvantages of Stata (continued)
- Nested/master do files
- Flexible terminology
- Setting types of data
- Interactive help
- Switch output (log file) on/off
- Copy and paste

Overview of analytic techniques
- Too numerous to mention!
- Comprehensive manuals
- A selection:
  - All types of regression
  - Survey package
  - Epidemiological package
  - Multilevel modelling
  - Time series functions
  - Cluster analysis

Data
- Data files: .dta
- Stat/Transfer software

Stata: using wide and long file formats
- Wide file formats (everything you add goes to the right of the existing data)
- Long file formats (everything you add goes underneath the existing data)

[Diagram: MERGE places Data 1 and Data 2 side by side; APPEND places Data 2 underneath Data 1]

_merge values
- 1: observation found only in Data 1 (indi), the ‘master’ file
- 2: observation found only in Data 2 (indj), the ‘using’ file
- 3: observation found in both files
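
The difference between MERGE and APPEND, and the meaning of the _merge values above, can be sketched in a few lines. This is plain Python written only to illustrate the logic; it is not Stata syntax and not how Stata implements either command. The two small datasets, the variable names (id, age, income), and the helper functions merge_on_key and append_rows are all hypothetical, and the sketch assumes a one-to-one match on a single key.

```python
# Illustration only: plain Python, not Stata code.
# _merge codes as listed on the slide.
MASTER_ONLY, USING_ONLY, MATCHED = 1, 2, 3


def merge_on_key(master, using, key):
    """Wide direction: add the using file's variables beside the master's.

    Assumes each key value appears at most once per file (one-to-one match).
    """
    using_by_key = {row[key]: row for row in using}
    master_keys = set()
    merged = []
    for row in master:
        master_keys.add(row[key])
        match = using_by_key.get(row[key])
        if match is not None:
            # For variables present in both files, the master value is kept.
            merged.append({**match, **row, "_merge": MATCHED})
        else:
            # Stata would store missing values for the using variables here.
            merged.append({**row, "_merge": MASTER_ONLY})
    for row in using:
        if row[key] not in master_keys:
            merged.append({**row, "_merge": USING_ONLY})
    return merged


def append_rows(master, using):
    """Long direction: stack the using file's observations under the master's."""
    columns = []
    for row in master + using:
        for name in row:
            if name not in columns:
                columns.append(name)
    # Variables absent from one file are filled with None (missing).
    return [{name: row.get(name) for name in columns} for row in master + using]


# Hypothetical example data: 'master' (Data 1) and 'using' (Data 2).
data1 = [{"id": 1, "age": 34}, {"id": 2, "age": 41}, {"id": 3, "age": 29}]
data2 = [{"id": 2, "income": 30000}, {"id": 3, "income": 45000}, {"id": 4, "income": 52000}]

merged = merge_on_key(data1, data2, "id")
appended = append_rows(data1, data2)

for row in merged:
    print(row)
print(len(appended), "observations after append")
```

In the merged result, id 1 appears only in the master file, ids 2 and 3 appear in both files, and id 4 appears only in the using file. Appending instead keeps every observation from both files as separate rows, so the dataset grows downward rather than to the right.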