Lecture 8/02

Population = a group of organisms of a given species that live in a particular area at a given time.
Sample = a subset of organisms extracted from a population.
What is a random sample?

Data types
1. Quantitative (numerical) variables
   a. discrete = can take only whole-unit values (counts). Examples: number of children in a family, number of asthma attacks per week
   b. continuous = can take any value within a given range. Examples: height, weight
2. Qualitative (categorical) variables
   a. ordinal (ordered categories). Examples: exam results (pass or fail), socio-economic status (low, middle or high)
   b. nominal (unordered categories). Examples: eye color, hair color

The Concept of "Distribution"
Measurements on any variable, even the same variable on the same subject, will always vary. The pattern of variation of a variable is called its distribution. The distribution records all possible numerical values of a variable and how often each value occurs (its frequency).

The most common example of a distribution is the bell-shaped Normal distribution. The Normal distribution is defined by two parameters: the mean and the standard deviation of the population. The curves are symmetrical and bell shaped, but the extent to which the bell is compressed or flattened out depends on the standard deviation of the population.

The distribution of a variable is of biological interest because it can indicate whether there is selection for or against organisms falling in one of the tails of the distribution. If, in a sample of immature insects, we discover that the measurements are bimodally distributed (with two peaks), this could indicate that different species are mixed together in the sample.

Descriptive statistics
A. Statistics of location: describe the position of a sample along a given dimension representing a variable.

Mean = a widely used measure of central tendency. It is the sum of the measurements divided by the number of measurements in the set.
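As a small illustration of a frequency distribution and of the mean, here is a minimal Python sketch. The data are made up for this example (number of children in ten hypothetical families, a discrete variable); they do not come from the lecture.

```python
from collections import Counter

# Hypothetical data (not from the lecture): number of children in 10 families.
children = [0, 1, 2, 2, 3, 1, 2, 0, 4, 2]

# Frequency distribution: each observed value and how often it occurs.
frequency = Counter(children)
for value in sorted(frequency):
    print(f"{value} children: {frequency[value]} families")

# Mean = sum of the measurements divided by the number of measurements.
mean = sum(children) / len(children)
print("mean =", mean)
```

The printed table is the distribution in its simplest form: the possible values in one column and their frequencies in the other.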
Median = the middle value when the measurements are arranged in order of magnitude. It is often the preferred measure for describing economic, sociological and educational data. Why? Many of the frequency distributions of these measurements are skewed, and the median is less affected by extreme values than the mean.

Calculate the median:
Example: A sample of 7 students was given a reading achievement test. Find the median for the test scores:
86 78 90 62 73 89 95
First we must arrange the scores in order of magnitude:
62 73 78 86 89 90 95
We have an odd number (7) of measurements; the median is the middle score, 86.

Example: Suppose 3 more students took the achievement test and scored 73, 75 and 91. Determine the median.
62 73 73 75 78 86 89 90 91 95
We now have an even number (10) of measurements; the median is the average of the two middle scores:
Median = (78 + 86)/2 = 82

Mode = the measurement that occurs most often in the set. It is useful in business for identifying products that are in greatest demand.

B. Statistics of dispersion: describe how spread out the data are around the mean.

Range = the distance between the lowest and the highest reading.
Standard deviation = a summary measure of the differences of each observation from the mean; it is the square root of the variance.
Variance = the average of the squared deviations of each score from the mean (with n - 1 as the divisor for a sample):

$s^2 = \frac{\sum (x - m)^2}{n - 1}$

where
$\sum$ = add up
x = value for the variable
m = mean of the sample
n = total number of measurements, or size of the sample

Hypothesis testing
1. Restate the research hypothesis as a statistical hypothesis and an alternative hypothesis.
2. The statistical hypothesis is called the Null Hypothesis, which is symbolized by H0; the alternative hypothesis is called the Alternative Hypothesis (H1).
3. You will test the Null Hypothesis.
4. Specify a statistical test to be used on the data.
5. Compare the numerical value calculated from the statistical test to a critical value derived from the specific distribution.
6. Reject the Null Hypothesis if your calculated value is greater than or equal to the critical (tabled) value.
7. State your results in statistical terms using probability: p < .05 (reject the Null Hypothesis) or p > .05 (fail to reject the Null Hypothesis), where p means probability.
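The descriptive statistics and the decision rule in steps 5 to 7 can be sketched in Python using the ten reading-test scores from the median example. The lecture does not name a specific test, so the one-sample t test here is only an illustrative choice, and the null-hypothesis mean of 75 is a made-up value. The critical value 2.262 is the standard tabled t value for a two-tailed test at the .05 level with 9 degrees of freedom.

```python
import math
import statistics

# The ten reading-test scores from the median example.
scores = [86, 78, 90, 62, 73, 89, 95, 73, 75, 91]
n = len(scores)

# Statistics of location.
mean = statistics.mean(scores)
median = statistics.median(scores)   # average of the two middle scores when n is even
mode = statistics.mode(scores)       # the score that occurs most often

# Statistics of dispersion.
value_range = max(scores) - min(scores)
variance = sum((x - mean) ** 2 for x in scores) / (n - 1)  # s^2 with n - 1 divisor
std_dev = math.sqrt(variance)

# Illustrative one-sample t test (assumption: H0 says the population mean is 75).
h0_mean = 75          # hypothetical value, not from the lecture
t_critical = 2.262    # tabled t, two-tailed, alpha = .05, df = n - 1 = 9
t_calculated = (mean - h0_mean) / (std_dev / math.sqrt(n))

# Step 6: reject H0 if the calculated value reaches the critical value.
reject_h0 = abs(t_calculated) >= t_critical

print("mean:", mean, "median:", median, "mode:", mode)
print("range:", value_range, "variance:", round(variance, 2), "sd:", round(std_dev, 2))
print("t =", round(t_calculated, 3), "reject H0:", reject_h0)
```

With these scores the calculated t is smaller than the critical value, so under the assumed null mean the result would be stated as p > .05 (fail to reject the Null Hypothesis).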