* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download 7 - 1 - Wiley
Survey
Document related concepts
Transcript
Management Science: The Art of Modeling with Spreadsheets, 2e Chapter 7: Data Analysis for Modeling S.G. Powell K.R. Baker © John Wiley and Sons, Inc. PowerPoint Slides Prepared By: Alan Olinsky Bryant University 7-1 7-1 Data Analysis in the Context of Modeling Supports the modeling process Improves accuracy of model Improves usefulness of conclusions Modeling is the primary goal. Data analysis is a means to that goal. 7-2 Topics for Chapter Finding facts in databases Editing, searching, sorting, filtering, and tabulating Sampling Estimating parameters Point estimates and interval estimates 7-3 Finding Facts from Databases Tables of information Each row is a record in the database. Each column is a field for the records. Excel calls such a table a list. 7-4 Excel Lists First row contains names for each field Each successive row contains one record. Lists may be: Searched and edited Sorted Filtered Tabulated 7-5 Searching and Editing Lists First assign a range name to entire list. Include column titles. With list selected choose Data – Form. Examine records one at a time: Find Prev. Find Next. Enter new record with New button. Delete record with Delete button. 7-6 Database Form 7-7 Criteria Button Found under Data – Form Allows for searching of records Enter data into a field. Click Find Next. 7-8 Alternate Excel Search Techniques Highlight entire database. Use Edit – Find to search. Use Find and Replace to edit entries. In Find and Replace “?” stands for any single symbol “*” stands for any sequence of symbols 7-9 Sorting: Data – Sort Command 7 - 10 Filtering Select database then Data – Filter – AutoFilter. Will filter lists based on values Found under arrow at the title of each column Arrow on title turns blue to remind list is filtered Can remove filter by: Select (All) using the list arrow; or Selecting Show All under Data – Filter 7 - 11 More Filtering Top 10 option returns records with smallest or largest value of a numerical record Custom option allows filtering with compound criteria More complicated compound criteria can be achieved with Data – Filter – Advanced Filter submenu. 7 - 12 Tabulating Select Data – Pivot Table. Creates summary tables Layout button on third step of wizard creates the format for the table 7 - 13 Analyzing Sample Data Data is unlikely to cover whole population Work with sample from population Statistics are summary measures about sample Want to construct statistics that represent population Convenience sampling Have easy access to information on subset of population Subset may not be representative Random sampling All objects in population have equal chance of appearing in sample 7 - 14 Descriptive Statistics Summarizes information in sample Gives numerical picture of observations Excel Tools – Data Analysis Descriptive Statistics table produced based on data given as input 7 - 15 Inferential Statistics Use information in sample to make inferences about population Systematic Error If sample not representative of population Avoid by careful sampling Sampling Error Sample is merely subset of population Mitigated by taking large samples 7 - 16 Estimating Parameters: Point Estimates The sample average is calculated as: x x n n i 1 i The sample variance is calculated as: (xi x )2 s n 1 i 1 2 n and its square root is the sample standard deviation: n s 2 (x x ) i i 1 n 1 7 - 17 (Optional) Estimating Parameters: Interval Estimates We can estimate parameters in two ways, with point estimates and with interval estimates. The interval estimate approach produces a range of values in which we are fairly sure that the parameter lies, in addition to a single-value point estimate. A range of values for a parameter allows us to perform sensitivity analysis in a systematic fashion, and it provides input for tornado charts or sensitivity tables. 7 - 18 Interval Estimates for the Mean P(L <= m <= U) = 1 – a. L and U represent the lower and upper limits of the interval. 1 – a represents the confidence level. Usually a large percentage like 95 or 99% m represents the (unknown) true value of the parameter. 7 - 19 Sampling Theory Working with a population described by a Normal probability model Mean m and standard deviation s. Take repeated samples of n items from population Calculate the sample average each time The sample averages will follow a Normal distribution with a mean of m and a variance of s2/n. 7 - 20 Estimates Standard error: the standard deviation of some function being used to provide an estimate. Use the sample average to estimate the population mean. The standard deviation of the sample average is called the standard error of the mean: sx s / n 7 - 21 Z-scores The z-score measures the number of standard deviations away from the mean. The z-score corresponding to any particular sample xm xm average is: z sx s n Tells how many standard errors from the mean 90% of the sample averages will have z-scores between –1.64 and +1.64. The chances are 90% that the sample average will fall no more than 1.64 standard errors from the true mean. 7 - 22 Confidence Intervals for Means Upper and lower limits on estimate for mean: x z(s / n ) n>30 recommended unless original population resembles Normal z can be computed using NORMSINV(1-a/2) Replace s by the sample standard deviation s Provided that sample is larger than n = 30 Excel Descriptive Statistics also will calculate halfwidth of confidence interval 7 - 23 Interval Estimates for a Proportion To estimate the sample proportion p, the interval estimate is: p(1 p) pz n Sample size should be at least 50 for this formula to be reliable 7 - 24 Sample Size Determination Suppose want to estimate mean of sample to within a range of ±R n = (zs / R)2 Assumes: Sampling from Normal distribution Known variance – can begin with small sample to estimate standard deviation 7 - 25 Sample Size Determination for Proportions Suppose want to estimate a proportion to within a range of ±R n = z2p(1 – p) / R2 Value maximized at p = 0.5 Conservative value: n = (z/2)2 / R2 7 - 26 Summary Data collection and analysis support the modeling task where appropriate. When early sensitivity testing indicates that certain parameters must be estimated precisely, we turn to data analysis for locating relevant information and for estimating model parameters. The process of finding facts in data is aided by a facility with Excel and in particular with its database capabilities. Excel provides an array of commands for searching, sorting, filtering, and tabulating data. Excel’s Data Analysis tool for calculating descriptive statistics enables rapid construction of point estimates and interval estimates from raw data. 7 - 27 Copyright 2008 John Wiley & Sons, Inc. All rights reserved. Reproduction or translation of this work beyond that permitted in section 117 of the 1976 United States Copyright Act without express permission of the copyright owner is unlawful. Request for further information should be addressed to the Permissions Department, John Wiley & Sons, Inc. The purchaser may make back-up copies for his/her own use only and not for distribution or resale. The Publisher assumes no responsibility for errors, omissions, or damages caused by the use of these programs or from the use of the information herein. 7 - 28