Transcript

Sampling
  ‘Scientific sampling’ is random sampling
  Simple random samples
  Systematic random samples
  Stratified random samples
  Random cluster samples
  What? Why? How?

What is random sampling?
  Simple random sample: every sample with the same number of observations has the same probability of being chosen
  Systematic random sample: choose the first sample member randomly, then take further members at a fixed interval
  Stratified random sample: choose simple random samples from the mutually exclusive strata of a population
  Cluster sample: choose a simple random sample of groups or clusters

Why sample randomly?
  To make valid statistical inferences to a population
  Conclusions from a non-probability sample can be questioned
  Conclusions from a self-selected sample are SLOP

How can samples be randomly chosen?
  Random number generators (software)
  Ping pong balls in a hopper
  Other mechanical devices
  Random number tables
  Slips of paper in a ‘hat’
  With or without replacement

Descriptive Statistics: Graphic Guidelines
  Pie charts: categorical variables, nominal data, e.g. ‘religion’
  Bar charts: categorical or numerical variables, nominal or interval data, e.g. ‘religion’ or ‘margin debt’; time series or cross-sectional data
  Line graphs: numerical variables, interval data, e.g. margin debt; time series data
  Histograms: numerical variables, interval data, e.g. golf scores; cross-sectional data; depict the SHAPE of a frequency distribution
  Stem and leaf plot: quick and dirty histogram
  Ogive: depicts a cumulative percentage frequency distribution
  Scattergram: two quantitative variables, e.g. margin vs. the market value

Graphic Deception: some widely used methods
  Graphs without a scale on one axis
  Captions or titles intended to influence
  Reporting only absolute changes in value and not percentage changes
  Changing the scale of the vertical axis with breaks or truncations
  Changing the scale of the horizontal axis
  Changing the width as well as the height of bars or pictogram figures

Summary of data types and available graphic techniques
  Numeric
    Cross-sectional data: histograms, percentage histograms, ogives, stem and leaf plots, box plots
    Time-series data: line charts, bar charts
  Nominal
    Pie charts, bar charts, complex pie or bar charts

Describing the frequency distribution for numerical, cross-sectional data
  Shape
  Center
  Spread

Describing distributions: SHAPE
  Graphs: histograms, percentage histograms, ogives, stem and leaf plots, box plots
  Words: symmetric, skewed, bell shaped, flat, peaked

Descriptive Statistics: CENTER
  Quantitative measures: mean, median, mode, midpoint of the range

Descriptive Statistics: Numeric Measures, cont’d.: SPREAD (dispersion)
  Range
  Symmetric distributions: standard deviation, variance
  Skewed distributions: quartiles, min, max, interquartile range, percentiles

Z Scores and t-scores
  Measure distance from the mean in standard deviations
  E.g. T score for bone density: 1 to 2.5 standard deviations below the norm (mean) for a 23-year-old indicates osteopenia; 2.5 or more indicates osteoporosis
  $z = (X - \mu)/\sigma$ (z score)
  $t = (X - \bar{X})/s$ (t score)

Empirical Rule
  For mound-shaped distributions
  About 68% of observations are within one standard deviation of the mean
  About 95% of observations are within two standard deviations of the mean
  Almost all (99.7%) observations are within three standard deviations of the mean

Chebyshev’s Rule
  For all distributions
  Let k be greater than or equal to 1
  At least $1 - 1/k^2$ of the observations are within k standard deviations of the mean
  Examples
    k = 1: the bound is zero, so possibly no observations are within one standard deviation of the mean
    k = 2: at least 3/4 of observations must be within two standard deviations of the mean
    k = 3: at least 8/9 of observations must be within three standard deviations of the mean
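
The Empirical Rule and Chebyshev's Rule can be compared numerically. The following is a minimal sketch, not part of the original slides: the function names `chebyshev_lower_bound` and `share_within`, the seed, and the two simulated data sets (one mound-shaped, one skewed) are all assumptions chosen for illustration.

```python
import random
import statistics


def chebyshev_lower_bound(k):
    """Minimum share of observations within k standard deviations (k >= 1)."""
    if k < 1:
        raise ValueError("Chebyshev's Rule is stated for k >= 1")
    return 1 - 1 / k**2


def share_within(data, k):
    """Observed share of data within k standard deviations of the mean."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)  # population SD of the data set itself
    return sum(abs(x - mean) <= k * sd for x in data) / len(data)


# Hypothetical data: a fixed seed keeps the run repeatable.
rng = random.Random(0)
mound_shaped = [rng.gauss(100, 15) for _ in range(10_000)]  # roughly normal
skewed = [rng.expovariate(1.0) for _ in range(10_000)]      # right-skewed

for k in (1, 2, 3):
    print(
        f"k={k}: Chebyshev bound {chebyshev_lower_bound(k):.3f}, "
        f"mound-shaped {share_within(mound_shaped, k):.3f}, "
        f"skewed {share_within(skewed, k):.3f}"
    )
```

The mound-shaped shares should land near 68%, 95% and 99.7%, while the skewed data need not follow that pattern. Both data sets stay at or above the Chebyshev bound, which is the point of a rule that holds for all distributions.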
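
The four sampling designs named in the sampling slides can be sketched with a software random number generator. This is an illustrative sketch under stated assumptions: the population of 100 numbered units, the two strata, the ten clusters, and all function names are hypothetical and do not come from the slides.

```python
import random


def simple_random_sample(population, n, rng):
    """Every subset of size n is equally likely (without replacement)."""
    return rng.sample(population, n)


def sample_with_replacement(population, n, rng):
    """Each draw is independent, so a unit may appear more than once."""
    return rng.choices(population, k=n)


def systematic_sample(population, n, rng):
    """Random start, then every step-th unit (requires n <= len(population))."""
    step = len(population) // n
    start = rng.randrange(step)
    return [population[start + i * step] for i in range(n)]


def stratified_sample(strata, n_per_stratum, rng):
    """A simple random sample drawn separately from each stratum."""
    return {name: rng.sample(units, n_per_stratum) for name, units in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep every unit inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]


rng = random.Random(42)  # fixed seed so the run is repeatable
population = list(range(1, 101))  # hypothetical units numbered 1..100
strata = {"low": population[:50], "high": population[50:]}
clusters = {f"c{i}": population[i * 10:(i + 1) * 10] for i in range(10)}

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)
stratified = stratified_sample(strata, 5, rng)
clustered = cluster_sample(clusters, 2, rng)

print("simple:", sorted(srs))
print("systematic:", systematic)
print("stratified:", stratified)
print("cluster:", clustered)
```

The with-replacement variant is included only because the slides mention both options; which one is appropriate depends on the study design.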
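
The measures of center and spread listed in the slides, along with the t score $(X - \bar{X})/s$, can be computed with Python's standard `statistics` module. The sketch below assumes a small made-up list of golf scores; the data and variable names are hypothetical, and the quartiles use the module's default method, which is one of several conventions.

```python
import statistics

# Hypothetical golf scores (cross-sectional, numerical data).
scores = [72, 75, 78, 80, 80, 83, 85, 88, 91, 98]

# CENTER
mean = statistics.mean(scores)              # 83
median = statistics.median(scores)          # 81.5
mode = statistics.mode(scores)              # 80
midrange = (min(scores) + max(scores)) / 2  # midpoint of the range: 85.0

# SPREAD
data_range = max(scores) - min(scores)      # 26
variance = statistics.variance(scores)      # sample variance
std_dev = statistics.stdev(scores)          # sample standard deviation
q1, q2, q3 = statistics.quantiles(scores, n=4)  # quartiles (default method)
iqr = q3 - q1                               # interquartile range

# t scores: distance from the sample mean in sample standard deviations.
t_scores = [(x - mean) / std_dev for x in scores]

print(f"mean={mean}, median={median}, mode={mode}, midrange={midrange}")
print(f"range={data_range}, variance={variance:.2f}, sd={std_dev:.2f}, IQR={iqr:.2f}")
print("t scores:", [round(t, 2) for t in t_scores])
```

A score above the mean gets a positive t score and a score below it a negative one, which is how the bone density example reads "standard deviations below the norm".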