Data Distributions
31/12/10
FANZCA Part I Notes
Jeremy Fernando (2010)

SUMMARY
(1) Normal Distribution
(2) Other (Standard normal distribution, Data transformation, Binomial, Poisson)

NORMAL DISTRIBUTION
- an observation that is normally distributed within a population has a norm, with random independent factors causing variation from that norm.
- most values cluster around the norm, with fewer & fewer values towards the tails.
- extreme values do exist.
- variation is random -> equal spread of values above & below the norm.
- mean = median = mode
- the normal distribution can be plotted to illustrate the frequency of observations or the probability of an observation arising.
- the curve is bell shaped, symmetrical and theoretically of infinite width, with tails that never reach the x axis.
- the mean and sd (standard deviation) of a sample are likely to be close to the mean and sd of the population from which it was sampled.
- the smaller the sample, the less likely it will have 'normal' geometry -> the less likely its mean and standard deviation will match those of the population.
- to determine whether a sample is normally distributed you can plot the data and 'eye-ball' the pattern.
- alternatively, you can calculate the mean & sd of the observed data & from these derive the frequencies of values that would be expected with these parameters.
- comparing the observed and expected frequencies will tell you whether the sample is 'normal'
- population mean = the average value in a real population

STANDARD NORMAL DISTRIBUTION
- a transformation of the points on a normal distribution into multiples of the standard deviation or standard error from the population mean.
- these multiples are termed z values & the conversion is sometimes referred to as the z transformation.

DATA TRANSFORMATION
- allows you to use parametric tests on data that have a skewed distribution by first converting them to a near normal distribution.
- can use the square root or the logarithm of the data

BINOMIAL DISTRIBUTION
- describes the probability of different proportions of a binary outcome arising in a fixed number of observations.
- e.g. the probabilities of different proportions of heads arising during sets of coin tosses
- the most likely proportion (the norm) in the sample = the population proportion (pi)
- e.g. heads in tossed coins = 0.5
- as sample size increases, it becomes more likely that the proportion of a particular observation within the sample will be the same as or similar to the population proportion (pi)
- e.g. the more times you throw the coin, the more likely the proportion of heads is close to 0.5
- the larger the sample -> the closer the binomial distribution is to the normal distribution.
- the probabilities of all possible outcomes must sum to 1.0
- the probability of a specific proportion arising in a sample is calculated using the binomial formula.
- the inputs are (1) the proportion you are seeking, (2) the population proportion, (3) the sample size
- as the binomial distribution can be approximated by a normal distribution, hypothesis tests such as the normal approximation test can be carried out to determine the probability of a particular proportion arising in a binomial distribution.

POISSON DISTRIBUTION
- describes the probability of a number of events occurring in a fixed time period or in a region of space.
- events must be random and independent of each other.
- the probability is calculated from an exponential formula and depends on prior knowledge of one parameter only -> the mean number of occurrences per unit time period (or unit region of space)
- e.g. if the number of adverse incidents in OT over a 2 year period is known, what is the probability of 5 incidents happening in one day?
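
WORKED SKETCH (PYTHON)
The three calculations these notes refer to (z values, the binomial formula and the Poisson formula) can be sketched in a few lines of Python. This is only an illustration and not part of the original notes: the function names are made up for the example, and the figure of 73 incidents in 2 years is a hypothetical number chosen purely to give the Poisson formula something to work on.

```python
import math

def z_value(x, mean, sd):
    """Distance of x from the mean, in multiples of the standard deviation."""
    return (x - mean) / sd

def binomial_probability(k, n, p):
    """Probability of exactly k outcomes of interest in n independent
    observations, where p is the population proportion."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_probability(k, mean):
    """Probability of exactly k events in one unit of time (or space),
    given the mean number of events per unit."""
    return math.exp(-mean) * mean**k / math.factorial(k)

# Binomial: probability that exactly half of 10 coin tosses are heads.
p_five_heads = binomial_probability(5, 10, 0.5)
print(f"P(5 heads in 10 tosses) = {p_five_heads:.4f}")

# The probabilities of all possible outcomes (0 to 10 heads) sum to 1.0.
total = sum(binomial_probability(k, 10, 0.5) for k in range(11))
print(f"Sum over all outcomes = {total:.4f}")

# Poisson: HYPOTHETICAL figure of 73 incidents in 2 years (730 days),
# giving a mean of 0.1 incidents per day.
mean_per_day = 73 / 730
p_five_in_day = poisson_probability(5, mean_per_day)
print(f"P(5 incidents in one day) = {p_five_in_day:.2e}")
```

Only the mean rate is needed for the Poisson calculation, which matches the point above that the distribution depends on a single parameter. With a mean of 0.1 incidents per day, 5 incidents in one day comes out as extremely unlikely.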