Session 7
Introduction to Research and Evaluation
Topic 1: Research Questions and Hypothesis Testing
Topic 2: Introduction to Statistics

For Tonight
Review the contents of the proposal
Topics tonight
  – Finish the research questions
  – Types of data review
  – Hypothesis testing
  – Intro to stats

The Phases of a Research Project
Problem statement
Purpose
Hypothesis development / research question(s)
Population / sample type
Results reporting (data)
Statistical testing
Conclusions
Recommendations

Parts of the Research Report
Chapter 1
Chapter 2
Chapter 3
Chapter 4
Chapter 5
References
Appendix

Components of Chapter 1
Introduction
Background of the study
Problem statement
Significance of study
Overview of methodology
Delimitations of study
Definitions of key terms
Conclusion (optional)

Characteristics of Components in Chapter 1
Introduction
  – 1 paragraph to 3 pages
  – Gets attention, gradually
  – Brief vs. reflective opening
Background
  – 2-5 pages
  – History of the problem, etc.
  – Professional vs. practical use
  – Be careful of personal intrusions
Problem statement
  – ½ page
  – States the problem as clearly as possible
Significance of study
  – 1 paragraph to 1 page
  – Answers: “Why did you bother to conduct the study?”
  – Be careful of promising too much

Ways to Convey Significance
Problem has intrinsic importance, affecting organizations or people
Previous studies have produced mixed results
Your study examines the problem in a different setting
Meaningful results can be used by practitioners
Unique population
Different methods used

Characteristics of Components
Delimitations
  – As needed
  – Not flaws
  – Establish the boundaries: can the study be generalized?
  – Consider: sample, setting, time period, methods

Stating the Problem
Developing a hypothesis:
  – Methods: estimation and hypothesis testing
  – Estimation: the sample is used to estimate a parameter, and a confidence interval about the estimate is constructed.
  – Parameter: a numerical quantity measuring some aspect of a population
  – Confidence interval: a range of values constructed so that it contains the parameter a high proportion of the time

Hypothesis Testing: the most common use
  – Hypothesis: an intelligent guess or assumption that guides the design of the study
  – Null hypothesis: there is no difference or there is no effect
  – Alternative hypothesis: there is a difference or there is an effect
  – Hypotheses: the plural of hypothesis; hypotheses are statements about the population

TYPES OF DATA

Variables
Two categories:
Independent – Variables that the researcher manipulates or uses to define groups.
  – Some cannot easily be manipulated without changing the participants: age, gender, year, classroom teacher, any personal background data, etc.
  – Others can be changed in an experiment: hours of sleep, amount of food, time given to complete an activity, curriculum, instructional method, etc.
Dependent – The outcome that is measured to see whether it changes with the independent variable.

Variables
A variable: any measured characteristic or attribute that differs for different subjects.
Two types:
  – Quantitative: measured on one of three scales:
      – Ordinal: rank order, such as first, second, or third choice (most of the children preferred red popsicles, and grape was the second choice)
      – Interval: equal distances between adjacent values, with no true zero
      – Ratio: equal intervals with a true zero (the time it takes a child to respond to a question; the number of times a black card is pulled compared with the number of times a red card is pulled)
  – Qualitative: sometimes called "categorical variables"; measured on a nominal scale.

Types of Data
Nominal Data -- Data that describe the presence or absence of some characteristic or attribute; data that name a characteristic without any regard to the value of the characteristic; also referred to as categorical data. Examples: Male = 1, Female = 2; blue, green, etc.
Ordinal Data -- Measurement based on the rank order of concepts or variables; differences among ranks need not be equal.
Interval Data -- Measurement based on numerical scores or values in which the distance between any two adjacent, or contiguous, data points is equal; the scale has no meaningful or true zero.
Ratio Data -- Order and magnitude. Measurement for which intervals between data points are equal; a true zero exists; if the score is zero, there is a complete absence of the variable.

Nominal level of measurement
Assigns a number to represent a group (gender; geography)
Numbers represent qualitative differences (good-bad)
No order to numbers
Statistics -- mode, percentages, chi-square

Ordinal level of measurement
Things are rank-ordered -- >, <
Numbers are not assigned arbitrarily
Assume a continuum
Examples -- classification (fr, soph, jr, sr), levels of education, Likert scales
Statistics -- median (preferred), mode, percentage, percentile rank, chi-square, rank correlation

Interval level of measurement
Equal units of measurement
Arbitrary zero point -- does not indicate absence of the property
Examples -- degrees, Likert-type scales (treated as interval), numerical grades
Statistics -- frequencies, percentages, mode, mean, SD, t test, F test, product moment correlation

Ratio level of measurement
Absolute zero
Interval scale
Examples -- distance, weight
Statistics -- all statistical determinations

Which are these?
Never married
Lower middle class
Divorced
Age
Separated
Middle class
Widowed
Weight
Religious affiliations
Height
Political affiliations
Distance
Freshmen

Which are these? (N = nominal, O = ordinal, I/R = interval/ratio)
Never married            N
Lower middle class       O
Divorced                 N
Age                      I/R
Separated                N
Middle class             O
Widowed                  N
Weight                   I/R
Religious affiliations   N
Height                   I/R
Political affiliations   N
Distance                 I/R
Freshmen                 O
Minutes                  I/R

Key Point
Statistical significance must be distinguished from practical significance
  – Even a small difference might be statistically significant if the sample is large
  – Note: a p-value of .0001 means that, if there is no real difference between groups, a difference at least as large as the one observed would occur by chance about 1 time in 10,000

Example Hypothesis
There will be no significant difference in the EOC scores for schools that use CAERT and those that don’t.
The EOC exam scores for schools using CAERT and those that don’t will not be significantly different.
The EOC exam scores for schools using CAERT and those that don’t will be significantly different.

Statistics for Teachers

Statistics
“If you can assign a number to it, you can measure it” Dr. W. Edwards Deming
Statistics
  – Refers to calculated quantities, regardless of whether or not they are from a sample
  – A statistic is defined as a numerical quantity
  – Often used more loosely to refer to a range of techniques and procedures for analyzing data, interpreting data, displaying data, and making decisions based on data, because those are the basic learning outcomes of a statistics course.
What are the mean, median, and mode in this example?

Descriptive statistics
Descriptive statistics summarize a collection of data in a clear and understandable way.
Example: scores of 500 children on all parts of a standardized test.
Methods: numerical and graphical.
  – Numerical: more precise; uses numbers as an accurate measure
      – mean: the arithmetic average, calculated by adding all the scores or totals and then dividing by the number of scores
      – standard deviation
        In a study of shyness, for example, these statistics convey information about the average degree of shyness and the degree to which people differ in shyness.
  – Graphical: better for identifying patterns
      – stem and leaf display: a graphical method that shows how the individual data values are distributed
      – box plot: a graphical method that shows the range the data cover; the box stretches from the 25th percentile to the 75th percentile
      – histograms
Since the numerical and graphical approaches complement each other, it is wise to use both.

Inferential statistics
For choosing a statistical test, variables fall into 2 groups:
Continuous variables are numeric values that can be ordered sequentially and that do not naturally fall into discrete ranges.
  – Examples include: weight, number of seconds it takes to perform a task, number of words on a user interface
Categorical variable values cannot be sequentially ordered or differentiated from each other using a mathematical method.
  – Examples include: gender, ethnicity, software user interfaces

Tools for Measuring
Measurement is the assignment of numbers to objects or events in a systematic fashion.
  – Four levels:
      – Nominal: assigning items to groups or categories. Examples: classroom, color, size
      – Ordinal: ordered in the sense that higher numbers represent higher values. Examples: 1 = freshman, 2 = sophomore
      – Interval: one unit on the scale represents the same magnitude on the trait or characteristic being measured across the whole range of the scale. Interval scales do not have a "true" zero point, so it is not possible to make statements about how many times higher one score is than another.
      – Ratio: one unit represents the same magnitude on the trait or characteristic being measured across the whole range of the scale. Ratio scales DO have true zero points.

Data Analysis
Explaining and interpreting the data:
  – "Data" is plural. You are looking at more than one number or group of numbers; subject-verb agreement is important when writing.
Central tendency: measures of the location of the middle or center of the distribution of a variable or group of variables
  – Frequency: the number of times a number appears
  – Mean: the arithmetic average
  – Mode: the number that appears most often
  – Median: the number in the middle when numbers are arranged by value
  – Skew: a distribution is skewed if one of its tails is longer than the other. Data may be skewed positively or negatively.
Standard deviation: a measure of how spread out the scores are around the mean (symbolized by sigma, σ)

Inferential statistics
Inferential statistics infer or imply something about a population from a sample.
  – Population: a total group
  – Sample: a few from the whole group
  – Representative sample: a sample that is proportionate to the population
  – Random sample: a sample that is chosen strictly by chance and is not “hand-picked”
  – Probability: the percentage chance that an event will occur

Parameters vs Statistics
Parametric vs Non-Parametric
Definitions again
  – A parameter is the true value in the population of interest (everyone)
  – A statistic is a number you calculate from your sample data in order to estimate the parameter
Example:
  – Population: all the ag teachers in the state
  – Sample: only 25 teachers selected from the 285 that exist

What can make the sample different from the true value/result of the whole?
Students taught by teachers using CAERT will score higher on end-of-course exams than those who are not.
  – True difference: one group actually has a higher capacity to learn.
  – Random variation: the two populations have identical means and the observed difference is a coincidence of sampling.
  – Sampling error (bias): poorly selected samples that do not represent the population.

Parameters or Parametric Data
Parameter: a numerical quantity measuring some aspect of a population of scores.
  – Parameters are usually estimated by statistics computed in samples
  – Greek letters are commonly accepted for writing formulas
  – Statistical symbols are most common in reporting actual data analysis in reports or articles
  – Greek letters are used to designate parameters

Quantity             Parameter   Statistic
Mean                 μ           M
Standard deviation   σ           s
Proportion           π           p
Correlation          ρ           r

Stats tests & the types of data each uses
1 Sample t-test · 1 continuous dependent variable with normal distribution · 0 independent variables
1 Sample Median · 1 continuous dependent variable with non-normal distribution · 0 independent variables
Binomial test · 1 bi-level categorical dependent variable · 0 independent variables
Chi-Square Goodness of Fit · 1 categorical dependent variable · 0 independent variables
2 Independent Sample t-test · 1 continuous dependent variable with normal distribution · 1 (2-level) categorical independent variable
Wilcoxon Signed Ranks Test · 1 continuous dependent variable with non-normal distribution · 1 (2-level) categorical independent variable
Chi Square Test · 1 categorical dependent variable · 1 (2-level) categorical independent variable
Fisher Exact Test · 1 categorical dependent variable · 1 (2-level) categorical independent variable
Paired t-test · 1 continuous dependent variable with normal distribution · 1 (2-level) categorical independent variable
One-way repeated measures ANOVA · 1 continuous dependent variable with normal distribution · 1 (multi-level) categorical independent variable
Friedman Analysis of Variance by Ranks · 1 continuous dependent variable with non-normal distribution · 1 (multi-level) categorical independent variable
One-way ANOVA · 1 continuous dependent variable with normal distribution · 1 (multi-level) categorical independent variable
Kruskal Wallis · 1 continuous dependent variable with non-normal distribution · 1 (multi-level) categorical independent variable
Factorial ANOVA · 1 continuous dependent variable with normal distribution · 2 or more categorical independent variables
Linear Regression · 1 continuous dependent variable with normal distribution · 1 continuous independent variable with normal distribution
Multiple Regression · 1 continuous dependent variable with normal distribution · 1 or more continuous independent variables with normal distribution
Linear Discriminant Analysis · 1 categorical dependent variable · multiple continuous independent variables with normal distribution
ANCOVA · 1 continuous dependent variable with normal distribution · 2 (or more) categorical or continuous independent variables with normal distribution

Results
At the end of the trial experience, schools using CAERT had EOC exam scores that were 18% higher than those of schools that did not use CAERT.
  – Alpha set at p < .05
  – Observed p-value of .03

Conclusion
Interpretation: If there were no true difference between schools using CAERT and those that don’t, the probability of observing a difference this large or larger by chance alone would be .03. Because .03 is less than the alpha of .05, the difference is statistically significant.

ANOVA
A factorial ANOVA has two or more categorical independent variables (either with or without the interactions) and a single normally distributed interval dependent variable.

ANOVA
In statistics, ANOVA is short for analysis of variance. Analysis of variance is a collection of statistical models, and their associated procedures, in which the observed variance is partitioned into components due to different explanatory variables.
  – The initial techniques of the analysis of variance were developed by the statistician and geneticist R. A. Fisher in the 1920s and 1930s, and are sometimes known as Fisher's ANOVA or Fisher's analysis of variance, due to the use of Fisher's F-distribution as part of the test of statistical significance.
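
As a rough illustration of how analysis of variance partitions the observed variance, the sketch below computes a one-way F statistic by hand. The three groups of scores are made-up numbers, not data from this session, and the function name one_way_f is hypothetical. It stops at the F statistic; converting F to a p-value requires the F-distribution, which a statistics package would normally supply.

```python
from statistics import mean


def one_way_f(groups):
    """Return (ss_between, ss_within, f_stat) for a one-way ANOVA."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)

    # Variation explained by group membership (between-group sum of squares)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation left over inside each group (within-group sum of squares)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)

    # F is the ratio of the two mean squares
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_stat


# Hypothetical scores for three groups (illustration only)
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
ss_between, ss_within, f_stat = one_way_f(groups)
print(f"SS between = {ss_between:.2f}, SS within = {ss_within:.2f}, F = {f_stat:.2f}")
```

The between-group and within-group sums of squares add up to the total sum of squares, which is the "partitioning" the slide describes. A large F suggests the group means differ by more than the within-group scatter would explain.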
Z test
The Z-test is a statistical test used in inference to determine whether the difference between a sample mean and the population mean is large enough to be statistically significant, that is, unlikely to have occurred by chance. The Z-test is used primarily with standardized testing to determine whether the test scores of a particular sample of test takers are within or outside of the standard performance of test takers.

Pearson Correlation
The Pearson correlation tool calculates the correlation coefficient between two measurement variables when measurements on each variable are observed for each of N subjects.
  – (Any missing observation for any subject causes that subject to be ignored in the analysis.)
The correlation analysis tool is particularly useful when there are more than two measurement variables for each subject. It provides an output table, a correlation matrix, showing the correlation for each possible pair of measurement variables.

Two-Sample t-Test
The Two-Sample t-Test analysis tools test for equality of the population means underlying each sample. The three tools employ different assumptions: that the population variances are equal, that the population variances are not equal, and that the two samples represent before-treatment and after-treatment observations on the same subjects.

Research Techniques
Types of hypothesis testing:
  – t-test: compares the means of two groups
  – ANOVA: analysis of variance; used to compare the means of several groups
  – Correlation: describes the relationship between two variables
  – Chi-square test of independence: tests whether there is a relationship between two categorical variables
  – Linear regression: the prediction of one variable based on another variable, when the relationship between the variables is assumed to be linear

Normal Curve
In practice, one often assumes that data are from an approximately normally distributed population.
If that assumption is justified, then about 68% of the values are within 1 standard deviation of the mean, about 95% of the values are within 2 standard deviations, and about 99.7% lie within 3 standard deviations. This is known as the "68-95-99.7 rule" or the "Empirical Rule".

Key points
Comparing groups for significant differences

Key Terms
Use for new terms in the profession (cognitive processing skills)
Give precision to an ambiguous term (learner)
General term used in a special way (learning style)
Writing a definition
  – State the term
  – Give the broad class to which the term belongs
  – Specify how the term's use differs from others in that class

Conclusion
  – Not always used
  – Summarizes if necessary
  – Tells the reader what to expect

Survey Construction
Parts:
  – Title
  – Directions: introduction to the survey
  – Items (a list of statements or questions), usually with a scale of some type
  – Demographic information
Scales
  – Rating
  – Ranking
  – Semantic differential
  – Likert type scale

Likert type scale
Ice cream is good for breakfast
  – Strongly disagree
  – Disagree
  – Neither agree nor disagree
  – Agree
  – Strongly agree

Rating
Scale of 1 to 5 or 1 to 7, etc.
  – 1 = best or highest?
  – 5 = best or highest?
Even number of items or odd?
  – Forced choice: no fence sitting
  – Middle: allows a middle-ground response
  – Might allow for no opinion (NA or NO)

Semantic differential

“In order to succeed you must know what you are doing, like what you are doing, and believe in what you are doing” Will Rogers

Setting Alpha Level
Set alpha at something like 0.05
Conduct a statistical test
Obtain a p-value

Parametric tests
  – Pearson Product Correlation Coefficient
  – Student t-Test
  – The z-Test
  – ANOVA

Nonparametric tests
  – Chi-Squared
  – Spearman Rank Coefficient
  – Mann-Whitney U Test
  – Kruskal-Wallis Test
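
The three steps under Setting Alpha Level (set alpha, conduct a test, obtain a p-value) can be sketched with a z-test, one of the parametric tests listed above. This is a minimal sketch that assumes the population standard deviation is known and the scores are approximately normal. The function name z_test and all of the numbers are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, pop_mean, pop_sd, n, alpha=0.05):
    """Two-sided z-test of a sample mean against a known population mean."""
    standard_error = pop_sd / sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    # Probability of a z at least this extreme if the null hypothesis is true
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    reject_null = p_value < alpha
    return z, p_value, reject_null


# Hypothetical example: 25 test takers averaging 106 on a test
# whose population mean is 100 and standard deviation is 15.
z, p_value, reject_null = z_test(sample_mean=106, pop_mean=100, pop_sd=15, n=25)
print(f"z = {z:.2f}, p = {p_value:.4f}, reject null at alpha .05: {reject_null}")
```

Alpha is fixed before looking at the data; the decision is then a simple comparison of the obtained p-value with alpha. A result that passes this test is statistically significant, which is a separate question from whether it is practically significant.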