Statistics 7110
Instructor: Athanasios C. Micheas, Ph.D.
Midterm examination (in class)
MDLBH 7, 10:00-10:50 p.m., Friday, October 28, 2011

Directions: Create a doc file named “’your name’ SAS exam.docx” and enter all your output, comments, plots, etc. in it. Email the file to the instructor at the end of class. Clearly mark your answers, and be sure to answer all questions. Make sure you include all your input (SAS code) and output (output window and graphics window). Answer all posed questions using SAS procedures only, not proc insight or the ASSIST module. The dataset needed for the problems can be found on the class website at http://www.stat.missouri.edu/~amicheas/stat305/datasets. Work on the problems alone; send the file to: [email protected].

Problem 1. (40 pts)
A summary of the current, past, and projected U.S. population can be found in the file USPOP.dat. The study was conducted in 1991 by the U.S. population reference bureau. The variables are:

section: section of the country, NE=New England, MW=Midwest, and so forth
zone: coded time zone
state: state
pop1991: population in year 1991, in thousands
pop1990: population in year 1990, in thousands
pop1980: population in year 1980, in thousands
pop2010: population in year 2010, in thousands
area: total area, in thousands of square miles
popdens: population density, in people per square mile
medage: average age
perc18: percentage (proportion) of population under 18 years as of 1991
perc65: percentage of population above 65 years as of 1991
coded: coded section

a) (25 pts) Create a SAS program that will read the data. To visualize differences across sections of the country in the average proportions of people younger than 18 and people older than 65, produce a plot with the variable section on the x-axis and perc18 and perc65 on the y-axis. Which section of the country has the highest average percentage of young people (younger than 18)? Which has the highest average percentage of people older than 65? Is it the same section? Use green for the perc18 line and blue for the perc65 line, and connect the points on the graph. (Hint: You will need to create a dataset that contains the means of perc18 and perc65 for each section to answer this problem. For the plot, use the overlay option in the plot statement, and two symbol statements with the appropriate color and interpol options.)

b) (15 pts) We wish to assess equality of the average proportions between people younger than 18 (perc18) and people older than 65 (perc65), for the year 1991, using a statistical procedure. Conduct a formal test for equality of the average proportions and comment on the results (use α = .05). Make sure to check the validity of the test. You may assume independence between the two age groups.

Problem 2. (60 pts)

a) (20 pts) Create a SAS program that produces a dataset with variables named x1-x12, each containing 20 generated values from a binomial distribution with sample sizes 1, 2, 3, ..., 11, 12 respectively and probability of success .3 for all variables. (Hint: Recall the do-loop examples…)

b) (20 pts) Create a horizontal bar chart displaying 12 bars, with each bar having length equal to the corresponding average of x1, x2, ..., x12 from the generated data. Note: Since we know the theoretical means of the random variables x1-x12, namely i*.3, i=1,2,...,12, the graph should look like a ladder going up.

c) (20 pts) Treating the generated data as a 20x12 matrix of integers, compute the frequency of each possible value 0,1,2,...,12 that the variables x1-x12 may take, and print your results. (Hint: Create a variable taking values 0,1,2,...,12 and another variable freq that will contain the frequency of those values in the table.)
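
The do-loop hint in Problem 2(a) and the frequency count in Problem 2(c) can be sketched as follows. This is only one possible approach, not the instructor's intended solution. The dataset names binom and long, the loop counters rep and i, the variable value, and the seed 7110 are all assumptions made for illustration. Note also that proc freq lists only the values that actually occur, so a value with zero frequency would not appear in the printed table.

```sas
/* Sketch for 2(a): 20 observations of x1-x12, with xi ~ Binomial(n = i, p = 0.3). */
data binom;
    call streaminit(7110);                  /* arbitrary seed (assumption), for reproducibility */
    array x{12} x1-x12;
    do rep = 1 to 20;                       /* one output row per replicate */
        do i = 1 to 12;
            x{i} = rand('BINOMIAL', 0.3, i);  /* arguments: p, then number of trials n */
        end;
        output;
    end;
    drop rep i;
run;

proc print data=binom;
run;

/* Sketch for 2(c): stack the 20x12 values into one column, then tabulate. */
data long;
    set binom;
    array x{12} x1-x12;
    do i = 1 to 12;
        value = x{i};
        output;                             /* 240 rows in total */
    end;
    keep value;
run;

proc freq data=long;
    tables value / nocum nopercent;         /* counts only, no percentages */
run;
```

Because every xi is at most i, large values such as 11 or 12 can come only from the last one or two variables, so they are expected to be rare or absent in a single run of 20 observations.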