Research Methods in Language Issues

Part I
Preliminaries of Research

Chapter 1
What is Research?

What is research?

Curiosity is part and parcel of human nature. Human beings are born curious. From the time little children learn to speak, they begin to ask questions and seek answers to them. It is to this curiosity that we owe most, if not all, of our present body of knowledge. In their attempts to find answers to their questions, human beings experience and learn new things. These attempts have been one of the basic sources of human knowledge throughout history. However, the kind and shape of these efforts have changed drastically over time. Early mankind obtained information in some very simple traditional ways. Today, technology and modern equipment have made it possible to carry out rather complicated and systematic investigations of various phenomena. In simple terms, research refers to this systematic approach we use to answer our questions.

Sources of information

In the course of history, our ancestors accumulated knowledge in ways that were not always as scientific and systematic as today’s research, which is conducted in well-equipped laboratories under strictly controlled conditions. The traditional sources of information include sensory experience, authorities, and logic.

A. Sensory experience

One of the earliest and most immediate sources of information is the personal experience we get through our senses: sight, hearing, smell, taste, and touch. Each of these senses is a valuable source of information. How do we know someone is at the door when the door is shut and we are inside? We hear the knocking. You are chatting with a friend on the phone. All of a sudden, you say: “I have to go now. The food is burning.” How do you know? You can smell it.
Even little Johnny, who is only six years old, knows that a certain food is more delicious than another. How? He can taste it.

Despite its obvious value as a source of information, sensory experience has a number of shortcomings. The first problem is that information obtained through the senses is not always accurate; our senses sometimes mislead us. As an example, look at the following lines. Are they parallel?

[Figure: lines that form an optical illusion]

See?! Your eyes misled you.

Moreover, the information we get through sensory experience is relative. Two eyewitnesses reporting a car accident to the police may, in all honesty, give two different or even conflicting accounts of what happened. This means that their eyes have seen the same thing differently. Or, put the forefinger of your right hand in a glass of hot water and the forefinger of your left hand in a glass of cold water. Obviously, your right-hand finger will feel hot and the other cold. Now take both fingers out and put them in a glass of lukewarm water. Although they are in the same glass, the finger that was in the hot water feels cold and vice versa. In short, although sensory experience is a valuable source of information, the accuracy and reliability of the information obtained this way is not always guaranteed.

In addition, experiencing things to get information is not always a safe venture. Obviously enough, no normal parent can afford to let their four-year-old child find out by experience what happens if they overturn a kettle of hot water on their body. This kind of knowledge does not do the child much good, since by the time s/he understands this, the poor kid also comes to realise that there isn’t much left of his/her life.

Finally, it goes without saying that not every piece of information can be obtained simply by the use of the senses. In fact, much of our present body of knowledge has been accumulated via some carefully
controlled and systematic studies under rather special circumstances.

B. Authorities

Another way of getting access to information is to refer to authorities. Obtaining information through experience takes time. People can share experiences with others to save time. Instead of waiting for a chance to experience something, one can easily consult someone who has already experienced it. Experts and authorities in various fields are people who have spent a considerable portion of their time studying certain phenomena, so they can be quite a reliable source of information. That is why, for example, when we catch a cold, we consult a doctor. Everyone will agree that the doctor has specialized knowledge and experience, and that his advice is more accurate and reliable than that of someone without such knowledge and expertise.

Even expert opinion, however, cannot be regarded as mere fact. After all, experts are human, and humans are apt to make mistakes. In addition, the information provided by experts might, for one reason or another, be biased. And sometimes the information might undergo a complete transformation before it reaches its final destination. For these and some other reasons, the information provided by experts and authorities should be treated cautiously.

C. Logic

Humans are endowed with a God-given blessing called ‘logic’. By this, they can reason, think, and learn new things. Logic is of two types: deductive and inductive.

Deductive logic moves from general statements of fact to more specific conclusions. For example, if you are to meet a person named Mark, whom you have never met before, you already know that ‘Mark breathes’. This is because you reason that:

A: Human beings breathe.
B: Mark is a human being.
So, C: Mark breathes.

Or, you naturally avoid drinking from a bottle on which the word ‘poison’ is written because your logic tells you:

A: Poison is dangerous.
B: Liquid X is a poison.
So, C: Liquid X is dangerous.

In both of these examples, sentence ‘A’ is called a major premise, ‘B’ a minor premise, and ‘C’ a conclusion. In the latter example, your deductive logic saves your life, but it is not always as helpful. Sometimes, using false major and minor premises may lead you to wrong conclusions:

Only men have logic.
Women are not men.
Women do not have logic.

The false conclusion is due to a faulty premise: ‘men’ in the first sentence means ‘human beings’, not ‘male humans’. Even major and minor premises that seem correct may lead to false conclusions, because there are exceptions in nature. You know that ‘birds can fly’. Then you learn that ‘a penguin is a bird’. So, you conclude that ‘penguins can fly’.

Inductive logic moves from individual and specific statements of fact to general conclusions. In the aforementioned examples, we started with some general statements such as ‘poison is dangerous’ and ‘birds can fly’. But where do these statements come from? How do we know that, say, birds fly? We know that birds fly because we have observed many individual members of the bird family fly. Then we have drawn a general conclusion:

Sparrows are birds, and they fly.
Pigeons are birds, and they fly.
Crows are birds, and they fly.
So, birds fly.

Again, it must be noted that because of our limited knowledge of nature, such generalizations hold true only as long as there is no evidence to the contrary. Once we learn that penguins and some other birds cannot fly, the general statement loses its validity as a fact, or at least needs modification.

Owing to the limitations of each of the above-mentioned sources of information, scientists nowadays (while still exploiting those traditional sources) employ scientific and systematic methods of obtaining information.
Using the scientific method, scientists have explored aspects of the universe that were, for so long, unknown to us. On the one hand, they know many things about stars and planets that are millions of light years away from us. On the other hand, they have learnt how to bombard atoms, split them, and learn things about them which only a few decades ago were simply inconceivable. It is the scientific method to which we owe most of our present technology, and it is this method which we will focus on from this chapter on.

Goals of research

Scientific research is usually done for one or more of the following four purposes:

1. Description
2. Prediction
3. Improvement
4. Explanation

Each of these goals will be briefly looked at below.

1. Description

A good description of the nature of any phenomenon is both necessary and useful for a better understanding of the way it interacts with other phenomena, the reasons for these interactions, and the way(s) these interactions can be handled. In other words, before we can do anything about a phenomenon in a systematic way, we first need to know what it is. Therefore, the purpose of many researchers is to describe an event or a phenomenon. Many questions such as the following can be answered by description:

1. What is the sequence of morpheme development in Iranian children?
2. In which order do Iranian children develop their knowledge of tenses?
3. What percentage of Iranian teachers use the Communicative Approach in their classes?

2. Prediction

Usually, description is not the end point of research. Description may be necessary to achieve one or more of the other three goals. Prediction is one such goal. Descriptions may be needed for making predictions about the future.
For instance, the description of the Iranian university student population will tell us things like the percentage of male and female students, the ratio of employed students to unemployed ones, etc. Such information may not always be valuable per se, but it proves quite useful in helping officials predict what problems they might face in the future and think of solutions before it is too late. Suppose the description of the student population in Iran shows that the ratio of female to male university students is two to one. This means that before long, the country will have twice as many female graduates as male ones. So, the government needs to create more jobs that are suitable for females. Or consider weather forecasts on TV and radio channels. Meteorologists study and describe the changes in the earth’s atmosphere in order to forecast the weather.

3. Improvement

One of the major reasons why research is done is to improve present conditions. Consider car-manufacturing companies. They spend huge sums of money on research projects annually to improve the quality of their products. Similarly, in the area of language teaching and learning, extensive research is directed towards greater efficiency in this profession.

4. Explanation

Earlier, it was pointed out that research originates in human curiosity, the desire to know the reason(s) for everything that happens around us. People experience many new things, and naturally come up with many ‘why’s. In an attempt to find explanations for these new mysteries, they conduct research.

Characteristics of research

Regardless of the purpose for which it is conducted, research should have certain characteristics. It should be:

1. Systematic
2. Generative
3. Reductive
4. Replicable
5. Logical

1. Research is systematic

This means that at each stage of research, researchers follow a number of pre-established steps and procedures. When researchers publicize the outcome of their research, they intend to share their findings with others. Systematicity means that they should follow certain already-established regulations, known to other researchers, so that comprehension and interpretation become easier.

2. Research is generative

The word generative means ‘productive’ or ‘creative’. Research generates something. What? Questions. This may sound strange. If research is carried out to answer questions, how can it generate questions? Well, in fact it answers the question under investigation, but generates many other questions. Suppose you are conducting research on the relationship between age and language learning. You have selected a group of young and a group of old subjects. You intend to compare their learning of English. While doing your research, you notice that the ratio of male subjects to female subjects in the two groups differs. This way a new question comes to your mind: “Does gender influence language learning?” You may also notice that the linguistic background of subjects in the two groups varies. Another question presents itself: “Does linguistic background affect L2 learning?” The more you try to answer these questions, the more you realize how many more questions there are yet to answer.

3. Research is reductive

This characteristic can be viewed from two perspectives: conceptual and practical. From the conceptual perspective, research reduces many individual statements of fact into fewer but more general statements. As an example, consider medical research. One study may arrive at the conclusion that a new drug (A) is good for curing the disease (D). Another study may end up with the conclusion that another drug (B) is also conducive to curing the same disease.
The outcome of still another study may suggest that a third drug (C) has similar effects on the same disease. Now, comparing and contrasting the three drugs, researchers come up with the conclusion that the drugs A, B, and C have the element X in common. Thus, the three individual statements are reduced to a single statement: that the element X cures the disease D.

From the practical point of view, research reduces the responsibility of other researchers. Due to the huge size of the human reservoir of knowledge and the multitude of questions waiting for answers, no one can answer all the questions and explore all the mysteries surrounding their field of specialization. For this reason, once a researcher explores one aspect of an issue and the finding is confirmed by some confirmatory research, the responsibility to carry out the same research will be off the shoulders of other researchers. They will take the result and use it as a base on which to make their own contribution to human knowledge.

4. Research is replicable

It was said that research reduces the responsibility of other researchers. But this is so only after the findings are confirmed by a number of other studies. In other words, the result produced by research should be repeatable. If the same research is repeated under the same conditions, the same results should be obtained; otherwise, the outcome cannot be used as a valid piece of information.

5. Research is logical

Finally, logic should be applied from the very beginning up to the very end of research. In all the steps, logic has to be applied to make sure that what is done makes sense. Imagine how strange it would be to give a test of listening comprehension to a group of students in order to investigate the effect of reading practice on writing ability!

Kinds of research

Before explaining kinds of research, a distinction should be made between kinds and methods of research.
Method refers to the practical steps and procedures employed in the research process. Three methods of research, namely the historical, descriptive, and experimental methods, will be elaborated in later chapters. Kind, on the other hand, refers to the nature of research and has nothing to do with the procedures employed. So far as kind is concerned, research is either exploratory or confirmatory.

Exploratory research is the kind of research that is done for the first time to explore a mystery. We owe many of the discoveries mankind has made to this kind of research. Confirmatory research, on the other hand, is the kind of research that is conducted after the exploratory research to see whether or not the results obtained via exploratory research can be confirmed. Simply put, it is the exact or partial replication of previous research in order to either consolidate or refute its findings.

Each of these two kinds can be either pure or applied. Pure research is done to satisfy human curiosity. It may have no real application in the real world. It is the kind of research done for the sake of research itself. In applied research, however, the finding of research is used for some practical purpose. Applied researchers apply the findings of pure researchers to the real world. Pure researchers are simply interested in discovering the laws of, say, physics, while applied researchers try to utilize the discovered laws in car-manufacturing companies to produce cars with certain characteristics including weight, speed, aerodynamic structure, etc.

In a nutshell, since the two classifications of research kinds are not mutually exclusive, one can imagine the following four kinds of research:

                Pure    Applied
Exploratory      1        3
Confirmatory     2        4

Chapter 2
The Concept of Variable
What is a variable?

Research is the study of the relationship between two or more variables. But what is a variable? As the name suggests, a variable is anything that can vary. It is an attribute or a characteristic that changes from person to person, object to object, time to time, situation to situation, etc. Weight is a variable because it differs from one person or object to another. Temperature is also a variable since it varies from place to place and from time to time.

Kinds of variables

Some variables, such as weight, age, length, and temperature, can be directly observed or measured. Such variables are called concrete variables. Using a thermometer, for instance, one can easily measure temperature. Some other variables, including intelligence, anxiety, language proficiency, etc., are not directly observable. Of course, they do have some manifestations and can be indirectly estimated or measured through their manifestations. Nevertheless, they cannot be directly observed and measured. They are called abstract variables.

Variables like gender, left-handedness, and marital status are discrete variables. Their nature is of the all-or-nothing type. That is to say, they either exist or do not. For example, every normal person is either male or female, either left-handed or right-handed; there is no third possibility. On the other hand, variables like intelligence, temperature, age, etc., which can represent a range of possibilities and form a continuum where every person or object may have a different degree of the attribute, are continuous variables.

Thus, concrete variables are those that can be directly observed and measured, while abstract variables cannot be directly observed and measured. In addition, when a variable is of an all-or-nothing (either-or) nature, it is discrete, but when there are different degrees of the attribute, the variable is continuous.
Again, it must be remembered that the distinctions made between concrete versus abstract and discrete versus continuous are not mutually exclusive. This means that both concrete and abstract variables can be either discrete or continuous, and vice versa. Hence, there can be four different kinds of variable:

             discrete   continuous
concrete        1           3
abstract        2           4

Functions of variables

Earlier it was mentioned that research is the study of the relationship between two or more variables. Depending on the kind of research question and the kind of relationship between variables, variables will have one of the following functions:

independent variable
dependent variable
moderator variable
control variable
intervening variable

An independent variable is a variable that is selected and manipulated by the researcher to study its effect on the dependent variable. It stands by itself and does not depend on any other variable, and is selected on the basis of the sheer interest of researchers. A dependent variable is one that is carefully observed and measured to determine the extent of the effectiveness of the treatment (the independent variable).

An example may help clarify the point. Suppose a researcher wants to study the effect of age on language learning. To conduct such a study, first of all, the researcher should operationally define the variables, i.e., s/he must define the variables in a way that allows them to be objectively measured and quantified. In this case, the researcher should clarify exactly what s/he means by age: what range of age is considered young, middle-aged, and old. Language learning needs to be defined in a similar way. Suppose that language learning is defined as the subjects’ performance (score) on a language proficiency test.

In the above-mentioned example, the choice of ‘age’ as a factor influencing language learning is totally up to the researcher. There are many other factors that influence language learning.
This researcher has just chosen ‘age’. So, the variable ‘age’, which has been selected by the researcher and does not depend on any other variable, is the independent variable. On the other hand, the researcher has no choice over the degree of language learning by the different age groups. He cannot choose which age group should learn how much language. The degree of language learning will be influenced and determined by the independent variable (age). Thus, language learning, defined as the subjects’ score on a language proficiency test, is the dependent variable.

To conduct such a study, after randomly selecting the subjects and assigning them to the young and old groups, a treatment (a certain amount of instruction) will be given to both groups under similar conditions. Then a language proficiency test will be administered to both groups. Comparing the scores of the two groups on the test, the researcher draws conclusions about the effect of age on language learning. Assuming that the mean (average score) of the young group is 16 and that of the old group is 14, and assuming further that there are ten subjects (male and female) in each group, the researcher may conclude that young age positively influences language learning.

Now suppose that while analyzing the data, the researcher notices that the ‘age’ of the subjects has affected the performance of male and female subjects differently, as shown below:

Age              Young              Old
Sex           Male   Female     Male   Female
Mean           14      18       14.4    13.5
Grand mean        16                14

Although the grand average of the young group is better than that of the old group, a closer look suggests that the young age of the male subjects does not positively influence their language learning. A comparison of the mean scores of the young and old male subjects shows that the young subjects have even a slightly lower score than their old counterparts.
The reason for the original conclusion, that young age influences language learning positively, seems to be the sharp difference in the mean scores of the young and old female subjects. The researcher now moderates the original strong claim and states the more moderate conclusion that young age positively affects the language learning of females. In this example, the variable ‘gender’, which influences the outcome of the research and moderates its finding, is called a moderator variable.

Assume further that someone who is critical of the above research questions the outcome of the study, claiming that the difference in the scores of the young and old females may not necessarily be caused by their age. There may well be other factors at work. S/he then argues that since the linguistic backgrounds (first languages) of the subjects were different, the difference in their scores might actually have been caused by their linguistic background rather than their age. The criticism is quite relevant and justified. To avoid facing such criticisms, the researcher needs to control other variables that are not under investigation but may influence the outcome of the study. Such factors should be made identical in both groups so that they do not influence the outcome of the research. Hence, variables that are held constant to prevent their possible effect on the outcome of research are called control variables.

It should be pointed out that not all of the variables influencing the outcome of research can be controlled. In the above example, for instance, the researcher can control variables like linguistic background by selecting both young and old subjects from a single linguistic background. Nonetheless, there are variables such as the mood of the subjects, stress, fatigue, etc. that influence the outcome but cannot be controlled. Such variables are intervening variables.
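The moderator effect in the age-and-gender example can also be checked with a few lines of arithmetic. The following is only an illustrative sketch in Python: the cell means are taken from the table in the example, while the equal split of five males and five females per age group is an assumption (the text says only that each group has ten subjects), and the names `cell_means`, `group_mean`, and `age_effect` are hypothetical.

```python
# Cell means copied from the age-by-sex table in the example.
cell_means = {
    ("young", "male"): 14.0,
    ("young", "female"): 18.0,
    ("old", "male"): 14.4,
    ("old", "female"): 13.5,
}

# ASSUMPTION: each age group has equally many males and females,
# so every cell carries the same weight in the group mean.
def group_mean(age):
    """Unweighted mean of the male and female cell means for one age group."""
    values = [mean for (a, _sex), mean in cell_means.items() if a == age]
    return sum(values) / len(values)

def age_effect(sex):
    """Young-minus-old difference in mean score within one sex."""
    return cell_means[("young", sex)] - cell_means[("old", sex)]

# Overall effect of age, ignoring the moderator.
overall_effect = group_mean("young") - group_mean("old")

# Effect of age computed separately for each level of the moderator.
effects_by_sex = {sex: age_effect(sex) for sex in ("male", "female")}

print("overall young-minus-old difference:", round(overall_effect, 2))
for sex, effect in effects_by_sex.items():
    print(f"young-minus-old difference for {sex}s:", round(effect, 2))
```

Under the equal-split assumption, the old group's mean comes out as 13.95, which the table reports rounded to 14. The overall difference is positive, but once the scores are broken down by sex, the difference is large for females and slightly negative for males. That is the pattern that leads the researcher to treat gender as a moderator variable.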
There is a second understanding of intervening variables. In our example, it was mentioned that on the basis of the scores of the subjects in the two groups (young and old), the researcher drew the conclusion that young age influences language learning. But you will remember that language learning is an abstract variable and cannot be directly observed and measured. What the researcher observed and measured was the scores of the subjects, not their language learning. Yet, in drawing such a conclusion, the assumption is that language learning stands between age and the scores of the subjects (the independent and the dependent variables). This means that the age of the subjects influences their language learning, which, in turn, influences their scores. Such a variable is an intervening variable.

Variable scales

Different variables require different scales of measurement. There are four kinds of variable scales:

1. nominal scale
2. ordinal (rank order) scale
3. interval scale
4. ratio scale

1. Nominal scale

The nominal scale is a scale of measurement in which numbers are used only to name or to code data, and do not have any mathematical value. Earlier it was explained that some variables are discrete (of the all-or-nothing type), like nationality, gender, etc. If the research question includes a discrete variable, for instance if it is about the effect of sex on language learning, obviously two groups of subjects will be needed, male and female. ‘Gender’ is a discrete variable. Everyone is either male or female. So the male subjects will have the feature [+ male] or just +, and the females will have the feature [- male] or simply -. Instead of using + and -, one can also use numbers. The male subjects can be assigned to group 1 and the female subjects to group 2, or vice versa.
The numbers used here have no mathematical value, that is, no group is mathematically superior to the other; the numbers are just names for the two groups.

2. Ordinal scale

Unlike the nominal scale, in the ordinal scale numbers have mathematical value: they indicate order. We know that abstract variables like intelligence and anxiety cannot be directly observed and measured. We can never see or observe someone’s intelligence or anxiety. What we can do in measuring such variables is to compare the subjects and rank them according to the relative degree of the attribute they have. We may decide, for example, that subject A is the most intelligent, B the second most intelligent, C the third, and so forth. Here, too, numbers can be used to name the subjects:

Subject   Degree of attribute            Rank
A         the most intelligent            1
B         the second most intelligent     2
C         the third most intelligent      3

This time, however, the numbers are mathematically significant and cannot be used interchangeably. Moreover, although the intervals (differences between the ranks) are mathematically equal, they are not actually equal. That is, the difference between the first and the second subjects in the rank is not necessarily the same as the difference between the second and the third.

3. Interval scale

The interval scale is similar to the ordinal scale. The only difference is that in the ordinal scale intervals are not equal, while in the interval scale they are. This is the scale that is most commonly used in educational settings. The scale used to measure students’ achievement in most classes has 21 levels ranging from 0 to 20.

4. Ratio scale

The ratio scale is much like the interval scale, the only difference being that the ratio scale has a true zero, that is, a zero point which indicates the complete absence of the attribute. Weight and length are measured on ratio scales: a weight of zero means that there is no weight at all, and an object weighing 20 kg is twice as heavy as one weighing 10 kg. When measuring temperature in degrees Celsius, on the other hand, zero does not mean that there is no temperature. If it was -5 °C yesterday and 0 °C today, this means that today it is 5 °C warmer than yesterday. Zero here simply indicates one specific level which stands between -1 and +1, so the Celsius scale is an interval scale, not a ratio scale. Similarly, it was pointed out that to measure language learning, we use an interval scale in which scores can range from 0 to 20. A score of zero on such a test does not show that the student has no knowledge of the language whatsoever, and a student who scores 20 does not necessarily know twice as much as one who scores 10.

It goes without saying that the ratio scale is at the same time an interval scale, because it has all the characteristics of the interval scale plus an additional feature (a true zero). By the same token, an interval scale is also ordinal and an ordinal scale is nominal, but not vice versa. Thus, each of the four measurement scales described above is convertible to the previous scale but not to the following one.

Part II
How to Conduct Research

Introduction

In Part I, research was defined, and the sources of information, goals, characteristics, and kinds of research were briefly explained. Also, the concept of variable as well as the kinds, functions, and measurement scales of variables were clarified. From this point on, the focus of attention will be on how to conduct research. Remember that one of the characteristics of research is that it is systematic, i.e., there are certain steps that should be followed in conducting research. In Part II, these steps will be dealt with. The steps that should be followed in conducting research include:

1. forming the research question
2. formulating the research hypothesis
3. reviewing the relevant literature
4. selecting a research method
5. collecting, summarizing, and analysing data
6. reporting research findings

Each of these steps will be discussed in detail in the following chapters.

Chapter 3
Research Question, Research Hypothesis, and Literature Review

Characteristics of research questions

Research is a systematic way of finding answers to questions. This means that any research begins with a question. However, not all questions are good research questions. A good research question should have a number of characteristics, including the following:

1. interest
2. relevance
3. manageability

1. Interest
The first, and probably the most important, characteristic of a research question is that it should be of interest to the researcher. Otherwise, the researcher will not commit him/herself to the task, and will conduct the research only to fulfil an assignment.

2. Relevance
A good research question is also relevant to the needs of both the researcher and the community or society of which s/he is a member. If an Iranian researcher intends to carry out research, the question “In what order do Iranian high school students learn the tenses of English?” will obviously be much more relevant than “In what order do South Korean high school students learn the tenses of Chinese?” Because financial resources and manpower are limited, priority should be given to research whose findings benefit the immediate community where the researcher lives.

3. Manageability
The research question should be practically feasible to investigate. No matter how interesting or relevant a research question is, it will be no good unless it is manageable, that is, unless it is not too broad or general to be answered in a single study. To be manageable, a research question should be narrowed down; its scope should be limited so that it becomes practically possible to investigate.

Types of research questions

There are three kinds of research questions:

1. descriptive questions
2. correlational questions
3. cause-effect questions

1. Descriptive questions
The purpose of a descriptive question is to describe something. Such questions ask about the duration, frequency, sequence of occurrence, etc. of an event. As an example, the question “In what order do Iranian children acquire the tenses of Persian?” is a descriptive question, requiring an observation and description of the sequence of tenses that appear in the language production of Iranian children.

2. Correlational questions
Correlational questions seek to find out the relationship between two factors (variables). A typical way of asking a correlational question is “What is the relationship between A and B?” For example: “What is the relationship between students’ knowledge of History and their knowledge of English?” Remember, however, that a relationship between two variables does not necessarily mean that one variable causes or influences the other. In the above example, students’ knowledge of History neither facilitates nor hinders their ability to learn English. In other words, knowledge of History does not create knowledge of English or vice versa. Perhaps both factors are influenced by another factor, say, intelligence.

3. Cause-effect questions
Cause-effect questions ask about the causal relationship between two or more factors. They ask about the effect of one factor on another. A typical cause-effect question will ask “What is the effect of X on Y?” For instance, “What is the effect of reading practice on speaking ability?” is a cause-effect question. Or, in the example given above, intelligence has a positive effect on the students’ learning of both History and English. So, the question “What is the effect of intelligence on learning English?” is a cause-effect question.
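
To make the idea of a correlational question more concrete, the following minimal Python sketch computes the Pearson correlation coefficient, one common measure of the degree of relationship between two sets of scores. The scores and the names `pearson_r`, `history`, and `english` are hypothetical and are not taken from the text. A high coefficient would only show that the two sets of scores go together; it would not show that one causes the other.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of the deviations from each mean
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when all scores in a list are equal")
    return cross / sqrt(ss_x * ss_y)

# Hypothetical scores for six students, invented for illustration only.
history = [12, 15, 9, 18, 14, 11]
english = [13, 16, 10, 17, 15, 10]

r = pearson_r(history, english)
print(f"r = {r:.2f}")
```

A value of r close to +1 or -1 indicates a strong relationship and a value close to 0 a weak one. Answering a cause-effect question requires a different kind of study, as later chapters explain.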

Research hypothesis

Once an appropriate research question has been formed, a research hypothesis should be formulated. A hypothesis is a tentative statement about the possible outcome of research. It is a tentative answer to the research question. Hypotheses are of two kinds: non-directional (null) hypotheses and directional (alternative) hypotheses. When no relationship is predicted between the independent and the dependent variables, the hypothesis will be non-directional. A directional hypothesis, on the other hand, predicts a relationship (either positive or negative) between variables. Here is the schematic representation of the classification of research hypotheses:

Research hypothesis
    Non-directional (null), H0
    Directional (alternative), H1
        positive
        negative

For several reasons, it is suggested that researchers formulate non-directional hypotheses whenever possible. First of all, if the hypothesis is directional, there may be ‘researcher bias’, i.e., the researcher may unwittingly tend to support his/her own claim.

Second, there is the principle of verifiability. To claim something and then present evidence to support the claim is not always a scientific and logical way of proving things. If something is to be taken as true, there should be no evidence to the contrary. Suppose someone claims that human beings have only one hand. Suppose further that to support his claim, he names one thousand people each of whom has only one hand. Does this mean that his claim is proven? Not a bit. To prove that human beings have only one hand, there should be no man with two hands. In such cases, a single piece of evidence to the contrary is sufficient to refute the whole claim. Hence, in formulating hypotheses, it is better to state a null hypothesis.
If there is not enough evidence to support the null hypothesis, this is automatically taken as evidence supporting the directional hypothesis.

Third, non-directional hypotheses require a greater degree of accuracy than directional hypotheses. This will be further discussed in later chapters.

Based on what has been stated, one might jump to the conclusion that directional hypotheses should not be used, and then ask, “What is the use of directional hypotheses?” Non-directional hypotheses should be used whenever possible, but not always. Directional hypotheses have their own merits. Consider the relationship between intelligence and language learning. Nobody has ever claimed that there is no relationship between intelligence and language learning. Nor has anybody contended that there is a negative relationship between the two variables. Every normal person agrees that there is a positive relationship between intelligence and language learning. If somebody decides to conduct a study on this question, the purpose will be just to determine the extent of the positive relationship, not to prove the existence of the relationship. In cases like this, it makes no sense to formulate a null hypothesis.

Review of literature

Having formed an interesting, relevant, and manageable research question, and having formulated an appropriate research hypothesis, the researcher should now comprehensively review all the previous documents and research findings pertinent to the topic under investigation. This process of documenting related materials is called ‘review of the related literature’. The word ‘literature’ refers to all the previous information about a certain topic. Review of literature is done for several reasons, including the following:

1. It helps the researcher to put the topic within a scientific framework.
2. It helps the researcher to avoid mere duplication of previous research.
3. It helps the researcher to avoid inadequacies of previous research.

Chapter 4
Methods of Educational Research

Introduction

After a comprehensive review of literature and documenting the materials, the next step for the researcher is to choose an appropriate research method. It was mentioned earlier that the term ‘method’ refers to the practical steps and procedures used in research. Depending on the nature of the topic, the researcher may select one of the following research methods:

1. Historical method
2. Descriptive method
3. Experimental method

1. Historical Method

The historical method of research is concerned with a systematic way of obtaining, processing, and evaluating data about past events. A number of points need to be clarified about the historical method.

First of all, the historical method of research should not be confused with literature review. In literature review, the researcher collects already existing information about a topic in order to put the topic within a scientific perspective and to consolidate and support a theoretical position. Historical research, on the other hand, is a process in which the researcher forms questions, formulates hypotheses, tests those hypotheses, and supports or rejects them, much like other research methods.

Second, since historical research deals with past events, there is no control over the variables. This may lead some to believe that the historical method is not scientific, since too many factors may have influenced an event, all of which were out of the researcher’s control. It needs to be noted, however, that the historical method, like other methods of research, is a scientific and systematic process in which the researcher follows certain steps, including these:

1. forming research questions
2. formulating research hypotheses
3. collecting data
4. criticizing the data
5. interpreting the findings

The third point to be mentioned about historical research is that since events are not directly observed by researchers, the relevant pieces of information are obtained from some sources. These sources include:

1. Official records, including laws, reports, and information compiled by universities and other government agencies
2. Non-official records, including personal records (like letters, wills, diaries), oral traditions (like folk tales), artistic remains (like paintings, movies), published materials (like books), and mechanical records (such as photographs)
3. Physical remains such as buildings, manuscripts, etc.

The sources of information in historical research are generally divided into two types: primary and secondary. Primary sources of information include all the information and documents left behind by actual participants in or witnesses of an event. Secondary sources of information, on the other hand, include documents and information provided by those who were not present at the scene but somehow obtained the information from other sources, especially primary sources.

Needless to say, researchers should make sure that the above-mentioned sources are genuine and truthful before using them as sources of information. They need to scrutinize and criticize the sources of information, especially the secondary sources. This criticism can be of two kinds: external and internal. External criticism deals with determining the authenticity of the document. It asks: ‘Is the document genuine and real or not?’ Internal criticism, on the other hand, seeks to establish the truthfulness and accuracy of the content of the document. It attempts to answer the question, ‘Is the content of the document true and accurate?’

2. Descriptive Method

Unlike the historical method, in which past events are studied, the goal of the descriptive method is to describe and interpret the present status of a phenomenon. The descriptive method of research encompasses three major kinds: survey, interrelational, and developmental.

I. Survey
Through surveys, researchers gather data by directly putting questions to the participants. That is, using questionnaires, interviews, observations, etc., they obtain the data directly from the respondents.

II. Interrelational methods
The purpose of interrelational methods, as the name suggests, is to investigate the relationship between and among various factors. There are four kinds of interrelational studies:

a. case studies
b. field studies
c. correlational studies
d. causal-comparative studies

a. Case studies
In case studies, researchers make a thorough and intensive investigation of a case. The case can be an individual or a small social unit. Much as in survey methods, data in case studies are collected on a social unit. However, case studies differ from surveys: in surveys, data are collected on a few factors from a large number of social units, whereas in case studies, many aspects (factors) of a single case (an individual or a small group) are investigated.

b. Field studies
A field study is a kind of investigation in which the investigator directly observes a naturally occurring event. That is to say, the researcher does not manipulate or interfere in the event, but simply observes the event as it occurs naturally. When the event to be observed is a short one, or when measuring its duration is important, the researcher may decide to observe the event for its entire duration. This kind of sampling is called ‘continuous time sampling’.
At other times, when it is neither possible nor important to observe the entire duration of an event or behaviour, researchers may opt for observing the event or behaviour only at the end of specific time intervals. This kind of sampling is called ‘time point sampling’.

c. Correlational studies
Correlation means ‘relationship between two or more factors’. In correlational studies, therefore, researchers seek to investigate the degree of relationship between two or more variables. It was pointed out earlier that a typical way of asking questions in correlational studies is, “What is the relationship between A and B?” Also remember that correlation refers only to the existence of a relationship, not to the existence of a causal relationship.

d. Causal-comparative studies
Like correlational studies, causal-comparative studies are used to determine the extent of the relationship between and among variables. One difference between the two is that causal-comparative studies are used to investigate the cause-effect relationship between factors, whereas correlational studies are used just to find the existence of a relationship. Another difference is that while correlational studies typically involve two or more variables and one group, causal-comparative studies usually involve two or more groups with only one independent variable. Moreover, causal-comparative studies are similar to experimental research in that they both involve comparisons, and both establish causal relationships between variables. The difference is that in experimental research, researchers deliberately manipulate variables to create a research situation, while in causal-comparative studies, researchers have no control over the variables. They study events after they have happened to discover the causal relationships between factors.
All research methods in which researchers have no control over the variables, cannot manipulate them, and study them after the events have occurred are known as ex post facto methods.

III. Developmental studies
Developmental methods of research are those used by researchers to study the trend of development of a behaviour or an event over time. Researchers use developmental research to investigate the changes that take place in phenomena in the course of their development. For instance, investigating language acquisition in children requires that researchers observe the linguistic behaviour of children for a relatively long period of time. This way, they can find out the characteristics of child language at different stages of development.

Developmental studies can be conducted in two ways: longitudinal and cross-sectional. In the longitudinal method, the behaviour of one or a few subjects is observed and studied for a long time, whereas in the cross-sectional method, a greater number of subjects (at different stages of development with regard to the variable under investigation) are studied for a shorter period of time. In the case of the above-mentioned example (child language development), the longitudinal method requires a comprehensive study of a few children for a long time. The cross-sectional method, on the other hand, allows the researcher to select more children from different age groups, and then study them for a much shorter time to learn about the characteristics of their language at different ages.

3. Experimental method

Historical research is concerned with the study of past events, and descriptive methods mainly deal with the description and interpretation of the current status of phenomena. In neither of these methods do researchers have any control over the variables. They cannot manipulate factors. In contrast, the experimental method requires experimentation.
Researchers deliberately manipulate and control variables and create situations solely for the sake of research. Due to the significance of this method of research and its common use in educational settings, a separate chapter is dedicated to the discussion of the experimental method.

In a nutshell, methods of research can be summarized and schematically represented as follows:

Methods of research
    Historical
    Descriptive
        Survey
        Interrelational
            Case studies
            Field studies
            Correlational studies
            Causal-comparative studies
        Developmental
            Longitudinal
            Cross-sectional
    Experimental

Chapter 5
Experimental Methods of Research

Introduction

It was pointed out in Chapter 4 that historical research studies past events, and that descriptive methods describe the present status of phenomena and investigate the relationships existing between variables. It was also noted that neither of these methods allows researchers to manipulate variables so as to create conditions for studying the cause-effect relationships between variables. The experimental method of research allows researchers to manipulate factors and to exercise control over variables and the conditions under which they interact, thus enabling researchers to draw sound conclusions. Moreover, because of certain rigid principles that researchers follow, the experimental method of research enjoys high validity, a concept to be discussed later in this chapter.

It needs to be clarified that there are different varieties of the experimental method. The advantages cited above pertain to the true experimental method. Other varieties of the experimental method include the pre-experimental and quasi-experimental methods. It should also be added that pre-experimental and quasi-experimental methods are considered experimental because they involve experimentation.
Nevertheless, they do not enjoy the same merits as the true experimental method. A look at each of these varieties will tell us the reason.

Pre-experimental method

To facilitate the understanding of the pre-experimental method, let us begin with an example. Suppose someone has devised a new method of language teaching. The founder of this new method contends that his method, named method A, works quite well and is superior to the current methods of language teaching. Obviously enough, to accept such a big claim we need evidence, and the person says he has just what we need. He says he has practised his method in his class and all his students have got such good grades that the average of the class is 19. To summarize, to conduct research, this researcher has selected a group of subjects, taught them using his special method (this is called the treatment and is represented as X), and has given them a test. Such a design is called a one-shot case study and can be schematically represented as:

G   X   T        (one-shot case study)

Do you accept the researcher’s claim that his method works better than the current methods and should replace them? Of course not. Many factors other than the method of teaching may be responsible for the high average score of the subjects. For one thing, we know nothing about the level of the students’ knowledge before the treatment was given. No test was administered to measure their entry behaviour. So, one may logically argue that the subjects’ level of proficiency could have been high even without instruction.

Convinced of the cogency of the argument, the researcher repeats his research. This time, before introducing the treatment, he administers a test (called the pre-test). Then the treatment is given, followed by another test (the post-test).
This design is called the one-group pretest-posttest design and is represented as:

G   T1   X   T2

Imagine that the average scores on the pre-test and the post-test turned out to be 12 and 19, respectively. The researcher now claims more strongly that his method caused the high score on the post-test, since the average score on the pre-test was not that high. Are you convinced now that method A is more effective than the current methods of language teaching? Not yet. There are still many problems. The researcher did not compare his method with any other. How can he conclude that his method is more effective than other methods without any comparison? Who knows what results the subjects could have achieved if they had been taught using some other method?

To obviate this problem and to be able to make comparisons, the researcher conducts the same study using two groups. One group of subjects receives the experimental treatment (is instructed via method A); the second group is taught using another method (receives an irrelevant treatment). The second group, receiving the irrelevant treatment, is called the control group, and the irrelevant treatment given to the control group is referred to as a placebo and represented as O. After the instructional period, a test is administered to both groups to compare their performance. What happened can be summarized schematically as:

G1   X   T
G2   O   T

This design is known as the intact group design. If the average score of the subjects in the experimental group is 19 and that of the control group 12, the researcher will joyfully shout, “Didn’t I tell you? No other method can produce such good results as mine.” Well, indeed he compared his method with another method, and those instructed using his method achieved a level of success far better than those in the other group. Now, do you take this as evidence of the superiority of method A over the other method? Only naïve people do.
Although there was a comparison between the two groups of subjects, was the comparison fair and just? Were the members of the two groups at the same proficiency level before experimentation began? We know nothing about this. A problem again.

Determined to win his case, the researcher repeats his study with one little difference. This time, he administers a pre-test to both groups before introducing the treatment and the placebo. So, the design will be something like this:

G1   T1   X   T2
G2   T1   O   T2

This design is referred to as the pretest-posttest control group design. Using this design, the researcher can make sure that the experimental and control group subjects were equal before the experiment. If the entry behaviour (performance on the pre-test) of the two groups was more or less similar but their terminal behaviour (performance on the post-test) radically different, the researcher would assert that the difference is due to the effect of the treatment. Would you agree? A careful person wouldn’t. Despite the fact that the level of the subjects’ knowledge was gauged before and after the treatment and a control group was also used, there is still reason to doubt that the difference between the average scores of the two groups is due to the effect of the treatment. A critic may cogently argue that although the proficiency level of the subjects was equal on the pre-test, there may have been other differences between the members of the two groups. One such difference could be in their intelligence. When assigning the subjects to the experimental and control groups, the researcher may have (intentionally or unwittingly) assigned the more intelligent subjects to the experimental group and the less intelligent ones to the control group.
In this case, it can be contended that the better performance of the subjects in the experimental group is not necessarily due to the effect of the treatment, and can be attributed to their higher level of intelligence. This problem of selection bias can be solved only if the subjects are randomly selected and assigned to the experimental and control groups.

To put everything in a nutshell, the pre-experimental method of research includes the following four designs:

1. one-shot case study
   G   X   T
2. one-group pretest-posttest design
   G   T1   X   T2
3. intact group design
   G1   X   T
   G2   O   T
4. pretest-posttest control group design
   G1   T1   X   T2
   G2   T1   O   T2

Each of these designs is considered a kind of pre-experimental method because it lacks one or more of the characteristics of the true experimental method. In other words, when one (or more) of the requirements of the true experimental method is either deliberately ignored or cannot be met, the method will be pre-experimental. What are the characteristics of the true experimental method? The following section will answer this question.

True experimental method

The true experimental method is the most strictly controlled and the most systematic, hence the strongest, method of investigating phenomena. From the previous discussions, it can be concluded that the true experimental method has a number of requirements. These requirements, which may also be considered the principles of the true experimental method, include the following:

1. There should be both a pre-test and a post-test.
2. There should be an experimental group to receive the treatment and a control group to receive the placebo.
3. Subjects should be randomly selected and assigned to the experimental and control groups.

If all these requirements are met, research will be truly experimental.
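
The third requirement, random assignment, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `randomly_assign`, the subject labels, and the fixed seed are hypothetical, and the sketch covers only the assignment of already selected subjects to groups, not their random selection from a population.

```python
import random

def randomly_assign(subjects, seed=None):
    """Shuffle the subjects and split them into experimental and control groups."""
    rng = random.Random(seed)   # a fixed seed makes the illustration repeatable
    shuffled = list(subjects)   # copy, so the caller's list is left unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    experimental = shuffled[:half]   # will receive the treatment (X)
    control = shuffled[half:]        # will receive the placebo (O)
    return experimental, control

# Hypothetical subject labels, invented for illustration only.
subjects = [f"S{i:02d}" for i in range(1, 21)]
experimental, control = randomly_assign(subjects, seed=42)
print("Experimental group:", experimental)
print("Control group:", control)
```

Because chance alone decides who goes into which group, the researcher cannot, intentionally or unwittingly, place the more intelligent subjects in the experimental group.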
If some of these requirements are deliberately ignored, or if they cannot be met, the research method will be pre-experimental. And if some of the requirements of true experimental research cannot be met, but the researcher attempts to make up for their lack, the method will be quasi-experimental.

Quasi-experimental method

Sometimes, one of the requirements of the true experimental method is missing, but the researcher tries to compensate for the missing requirement. In such cases, the method is called quasi-experimental. Suppose a researcher is interested in investigating the effect of a specific method (X) on language learning. For this purpose, he needs two groups of subjects, one to receive the experimental treatment and the other a placebo. Suppose further that the researcher has access to only one group of subjects. There is no group available to act as the control group. But the researcher does not wish to carry out a pre-experimental study, so he thinks of a solution. He uses the same group of subjects as both the experimental and the control group. After a pre-test, he applies his treatment for a week and then gives a post-test. The following week, he gives the pre-test again, but this time instead of giving the treatment, he gives a placebo, and then the post-test. During the third week, the treatment is given again, preceded by a pre-test and followed by a post-test. This goes on in such a way that the treatment and the placebo are presented in alternate weeks. Finally, the researcher compares the subjects’ attainment during the treatment and placebo intervals. If the outcome is somewhat similar to the following graph, he can conclude (even in the absence of a control group) that the treatment is more effective than the placebo.
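
Before turning to the graph, the comparison just described can be sketched in Python. The weekly scores below are hypothetical, invented purely for illustration, and the names `weeks` and `mean_gain` are not from the text. The sketch simply averages the gain from pre-test to post-test separately for the treatment weeks (X) and the placebo weeks (O).

```python
from statistics import mean

# Each tuple: (condition, pre-test average, post-test average) for one week.
# "X" marks a treatment week, "O" a placebo week. All scores are hypothetical.
weeks = [
    ("X", 10, 15),
    ("O", 15, 15),
    ("X", 15, 19),
    ("O", 19, 20),
]

def mean_gain(weeks, condition):
    """Average (post-test minus pre-test) gain over the weeks with the given condition."""
    gains = [post - pre for cond, pre, post in weeks if cond == condition]
    if not gains:
        raise ValueError(f"no weeks with condition {condition!r}")
    return mean(gains)

treatment_gain = mean_gain(weeks, "X")
placebo_gain = mean_gain(weeks, "O")
print("Mean gain in treatment weeks:", treatment_gain)
print("Mean gain in placebo weeks:", placebo_gain)
```

A clearly larger mean gain in the treatment weeks than in the placebo weeks is the pattern the researcher is looking for. Whether a given difference is large enough to be meaningful is a separate statistical question that this sketch does not address.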

[Figure 5.1 An equivalent time series experiment showing the effect of the treatment. Vertical axis: 0 to 20; horizontal axis: T1 X T2, T3 O T4, T5 X T6, T7 O T8.]

This is called the equivalent time series method. Another common variety of the quasi-experimental method is known as the time series method. This method is also used to make up for the lack of a control group. The researcher administers several pre-tests to capture the trend of natural growth when there is no treatment (instruction). Then he introduces the treatment and, finally, gives a number of post-tests to get an idea of the pace of progress after the treatment. Comparing the trend of changes in scores before the treatment with the trend after it will indicate the extent of the effect of the treatment. Schematically, the time series method is represented as:

T1   T2   T3   T4   X   T5   T6   T7   T8

[Figure 5.2 A time series experiment showing the effect of the treatment. Vertical axis: 0 to 18; horizontal axis: T1 T2 T3 T4 X T5 T6 T7 T8.]

Validity in research

Thanks to its characteristics, the true experimental method produces valid results. Validity in research can be classified into two kinds: internal and external. Internal validity refers to the extent to which the finding of research is the result of the effect of the treatment and not of any other factor. In other words, internal validity is the degree to which the outcome of research is due to the manipulations imposed by the researcher rather than to anything else. Thus, if the effect of, say, ‘age’ on language learning is being investigated, the experimental and control group subjects should be equal with regard to other factors that might influence the outcome, such as ‘sex’, ‘linguistic background’, ‘intelligence’, etc.
In simple terms, research will be internally valid if all irrelevant factors are controlled and the subjects of both groups are in equal conditions except for the variable under investigation. External validity refers to the generalizability of research findings, that is, the extent to which the outcome of research can be applied to other similar situations.

Chapter 6
Collecting and Summarising Data

Introduction

In the previous chapters, it was explained that research begins with research questions, based on which hypotheses are formulated. Then comes the literature review, which is a comprehensive review of all the previous research done in the area under investigation. Next, depending on the kind of research question and the nature of the relationship between variables, an appropriate method should be selected. Having done all this, researchers should now begin to collect data.

Data collection

To test their hypotheses, researchers need to collect data, and to collect data they need to use some techniques and instruments. The following instruments may be used for data collection.

1. Questionnaires
Questionnaires include a set of questions to be answered by the subjects. There are two kinds of questionnaires: open-ended and closed form. Open-ended questionnaires contain a number of questions which respondents answer in their own words, based on their own feelings. In closed form questionnaires, on the other hand, respondents choose the answer from among a certain number of given choices. There are advantages and disadvantages in both kinds. The major advantage of closed form questionnaires is that the choices are uniform; hence the responses are comparable. The problem is that none of the given choices may actually reflect the true feeling of the respondents.
That is to say, when choices are provided, the respondents are deprived of their freedom to respond as they wish. In addition, the researcher’s likes and dislikes may influence the kind of choices provided. The advantages of closed form questionnaires are, obviously, the disadvantages of open-ended questionnaires, and vice versa. Questionnaires can be distributed either directly or indirectly, i.e., by post.

2. Observation
Another way of collecting data is for the researcher to observe a phenomenon, behaviour, or event as it happens. Observation may be direct or indirect. In direct observation, the researcher uses a carefully prepared checklist to record data. The advantage of this kind of observation is that it helps researchers obtain objective data. The disadvantage (especially when human beings are observed) is that the obtained data may not be natural, since people who know they are being observed may not behave as they normally do. In indirect observation, the researcher observes an event or behaviour without letting those involved know that they are being observed. Therefore, the data obtained this way are quite natural. However, indirect observation is less systematic and objective than direct observation. The use of indirect observation also needs to be ethically and morally justified.

3. Interview
Still another way of collecting data is to conduct interviews. Interviews can be conducted in two ways: structured and unstructured. In structured interviews, the interviewer prepares a list of questions in advance. These questions are then posed one by one, and the responses to each question do not influence the choice of the following questions. This kind of interview provides quantifiable, uniform, and comparable data. The problem is that the answers given to earlier questions sometimes make later questions redundant.
For instance, if the question “Are you married?” is followed by the question “How many children do you have?”, the second question will be totally irrelevant if the answer to the first question is “No”. This may make the interview somewhat unnatural. The unstructured interview is more flexible, hence more natural, than the structured interview. Here, questions are not prepared in advance. Rather, as the interview goes on, relevant questions are posed in such a way that the response given to one question leads to another question, the answer to the second question paves the way for a third, and so forth. The major disadvantage of the unstructured interview is that the obtained data are not uniform and comparable since every interviewee might be asked a different set of questions.

4. Inventories

Inventories contain a number of statements to which the respondents respond by expressing the degree of their agreement or disagreement. Inventories are usually used to describe the attributes and feelings of the respondents. At the end of each semester, students may be asked to express their feelings towards certain aspects of their courses including the timing, quality of instruction, content, etc. This is an example of an inventory; statements are presented and students express the degree of their satisfaction with each aspect of the course by choosing one of the given alternatives including, for example, excellent, very good, good, average, and poor. Then, each of these alternatives is weighted, that is, assigned a point value.

excellent   very good   good   average   poor
5           4           3      2         1

Adding up the points given to an item and computing their average shows the attitude of the students towards that item.

5. Tests

Tests are among the most commonly used instruments of data collection. Obviously enough, tests should have certain characteristics if they are to be used as data collection instruments.
Among those characteristics are validity and reliability. Validity refers to the extent to which a test measures what it is supposed to measure. So a test of mathematics is not valid for gauging the knowledge of English. Reliability has to do with the consistency of scores produced by the test. It refers to the extent to which scores are consistent over repeated administrations of the test.

6. Projective measures

There are times when, for one reason or another, subjects consciously avoid providing the researcher with honest responses. In such cases, projective measures might be used. Projective measures are measures taken by researchers to get information indirectly out of the subjects without letting them know what they are actually doing. For instance, ambiguous questions are posed so that subjects do not know the purpose of the question. This helps reduce conscious dishonesty. Provided that the use of projective measures is ethically justified, they are very useful for obtaining the true feeling of subjects towards something.

Summarising the data

The first thing to do after data are collected is to summarise them. A pile of sheets with a score on each is not easily interpretable, hence not very useful. To summarise data, researchers tabulate the data, that is, put the data in a table. Of course, tabulation of data requires that data be coded. In previous chapters, we discussed the measurement scales such as nominal, ordinal, interval, and ratio. Depending on the kind of scale employed, there will be various kinds of data including nominal data, ordinal data, etc. If, for example, tests are used as the data collection instrument, the researcher can summarise the data by drawing a table with some columns, using numbers to represent subjects (nominal data) in the first column and listing their scores in the second column.
This is much more easily understandable and comparable than having many sheets with only one score on each. To make comparison easier, we can rank the scores from the highest to the lowest or vice versa. Sometimes, more than one subject may obtain a certain score. In such cases, especially when the number of subjects is large, we can summarise the data further by writing each score only once and indicating, in a separate column, its frequency, that is, the number of times that score has appeared in the distribution.

Basic computations

There are some basic computations that can make data more readily understandable. These computations will also help us learn concepts that will prove quite helpful in the later steps of conducting research. To begin these computations, let us suppose that the following interval data are obtained from a group of 100 subjects and summarised as follows:

Table 6.1

Score   Frequency
20      3
19      5
17      6
16      10
15      25
14      30
12      11
11      5
10      3
8       2
Total   100

In the second column of table 6.1, the frequency of each score is given. This kind of frequency is referred to as simple frequency or absolute frequency. It simply shows how many times a score has occurred in the distribution. Simple frequency may sometimes cause some misunderstanding or misinterpretation. For instance, suppose you are comparing two groups of subjects. In group A, five students have obtained 19 while in group B, fifteen students have got the same score. How would you interpret this? A simple-minded person may quickly conclude that group B is better. But if it turns out later that group A has 15 members and group B consists of 60 members, what then? The decision will be reversed because in group A, five people out of fifteen (that is, one third) have got 19; whereas, in group B, 15 out of 60 (i.e., one fourth of the group) have obtained the same score.
So, simple frequency is insufficient for interpretation. We need to compute relative frequency. Relative frequency shows the frequency of each score in relation to the total number of scores. It indicates how many people have got a score out of how many. To compute relative frequency (rf), absolute frequency (f) is divided by the total number of scores (N).

$rf = \frac{f}{N}$

In table 6.1, we can see that three people out of 100 have got 20. So the relative frequency of score 20 is .03. In order to avoid coming up with decimal fractions, we multiply relative frequency by 100. The result is called percentage (P) and shows how many people have got a score on a scale of (out of) 100.

$P = rf \times 100 = \frac{f}{N} \times 100$

The newly obtained pieces of information are clearly more meaningful than the original raw data. Table 6.2 contains a summary of the new data.

Table 6.2

Score   f    rf    P
20      3    .03   3
19      5    .05   5
17      6    .06   6
16      10   .10   10
15      25   .25   25
14      30   .30   30
12      11   .11   11
11      5    .05   5
10      3    .03   3
8       2    .02   2

Despite their value, none of these pieces of information shows the position of a score in the distribution. To know about the position of a score among other scores, we need to compute cumulative frequency. Cumulative frequency (represented as F) is an index indicating how many scores are at or below a given score. To compute cumulative frequency, we start at the bottom of the table and add simple frequencies successively from bottom to top of the table. So, in table 6.2, the cumulative frequency of each score will be computed this way: We start at the bottom row of the table and ask, “How many scores are equal to or below 8?” The answer, according to the table, is 2. Now we move up to the next row and ask the same question, “How many scores are equal to or below 10?” The answer is 5 since three scores are equal to 10 and two scores are below.
In fact, we obtained 5 by adding the simple frequencies from the bottom upwards. The frequency of the lowest score (8) is 2 and that of the second lowest score (10) is 3; the sum of the two equals 5 (3 + 2 = 5). Cumulative frequency has the advantage of showing the rank of each score in the distribution. Yet, like the simple frequency, it is subject to misinterpretation. If you are comparing two scores in two different distributions, and if the cumulative frequency of one score is 10 (that is, the score is better than or equal to 10 scores) and the cumulative frequency of the other score is 20, which score has a better standing? You might say, “Obviously, the score with a cumulative frequency of 20.” But again, it all depends on the total number of scores in the distribution. Supposing that the score with the cumulative frequency of 10 belongs to a distribution containing 20 scores and the score with the cumulative frequency of 20 is in a distribution with a total number of 80 scores, it will not be hard to see that the former score has a much better position than the latter. The former is equal to or better than 10 scores and worse than 10 others; it is somewhere in the middle of the distribution. The latter is better than or equal to 20 scores but worse than 60; it is somewhere in the bottom quarter of the distribution. To overcome such misunderstandings, we need to make cumulative frequency relative, just as we did with simple frequency. We divide cumulative frequency by the total number of scores and call the result relative cumulative frequency (rF).

$rF = \frac{F}{N}$

Since the result will be a decimal fraction, we multiply relative cumulative frequency by 100, and the outcome is called percentile.
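As a rough sketch, the frequency measures described so far can be reproduced in a few lines of Python. The data are the scores and simple frequencies of table 6.1; the function name `frequency_table` and the dictionary keys are illustrative choices made here, not something prescribed by the text.

```python
# Scores and simple frequencies from table 6.1, highest score first.
scores = [20, 19, 17, 16, 15, 14, 12, 11, 10, 8]
freqs = [3, 5, 6, 10, 25, 30, 11, 5, 3, 2]


def frequency_table(scores, freqs):
    """Return one row per score with f, rf, P, F, rF and percentile.

    Hypothetical helper for illustration; rows keep the input order.
    """
    n = sum(freqs)  # N, the total number of scores

    # Cumulative frequency: add simple frequencies from the lowest score up.
    cumulative = {}
    running = 0
    for score, f in sorted(zip(scores, freqs)):
        running += f
        cumulative[score] = running

    rows = []
    for score, f in zip(scores, freqs):
        big_f = cumulative[score]
        rows.append({
            "score": score,
            "f": f,                         # simple (absolute) frequency
            "rf": f / n,                    # relative frequency
            "P": 100 * f / n,               # percentage
            "F": big_f,                     # cumulative frequency
            "rF": big_f / n,                # relative cumulative frequency
            "percentile": 100 * big_f / n,  # percentile
        })
    return rows


for row in frequency_table(scores, freqs):
    print(row["score"], row["f"], round(row["rf"], 2), round(row["P"]),
          row["F"], round(row["rF"], 2), round(row["percentile"]))
```

Because the group in table 6.1 has exactly 100 subjects, percentage equals simple frequency and percentile equals cumulative frequency; with any other group size the columns would differ.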
The percentile formula is, therefore:

$\text{Percentile} = rF \times 100$

Table 6.3

Score   f    rf    P    F     rF     Percentile
20      3    .03   3    100   1.00   100
19      5    .05   5    97    .97    97
17      6    .06   6    92    .92    92
16      10   .10   10   86    .86    86
15      25   .25   25   76    .76    76
14      30   .30   30   51    .51    51
12      11   .11   11   21    .21    21
11      5    .05   5    10    .10    10
10      3    .03   3    5     .05    5
8       2    .02   2    2     .02    2

Chapter 7 Displaying and Describing the Data

Displaying the data

Another step to make data easier to grasp is to display the data graphically. Different kinds of graphs can be used to display data. When the number of scores is limited, bar graphs can be used. In bar graphs, bars are used to represent data. As an example, the simple frequency of the scores in table 6.3 can be graphically represented like this:

Figure 7.1 A bar graph (horizontal axis: scores; vertical axis: frequencies)

In the above figure, the horizontal axis shows the scores, and the vertical axis represents the frequency of the scores. A vertical bar represents the frequency of each score. Instead of bars, it is also possible to mark the point above each score at the height of its frequency (the intersection of each score and its corresponding frequency), and to connect these points together. The resultant graph is called a frequency polygon (because it usually has several angles).

Figure 7.2 A frequency polygon (horizontal axis: scores; vertical axis: frequencies)

When the number of scores is large, the use of too many bars may make the graph messy and a little confusing. In such cases, instead of using bars to show the frequency of every individual score, the scores are divided into certain intervals, and the frequency of each interval is shown by a column. For instance, if the frequency of scores on a TOEFL test is to be graphically presented, due to the wide range of possible scores, a large number of bars are needed.
Under such circumstances, scores are divided into intervals, and columns are used to show the frequency of each interval. This is called a histogram.

Figure 7.3 A histogram (horizontal axis: score intervals; vertical axis: frequencies)

Since the frequency polygon is one of the most widely used means of the graphic representation of data and since frequent reference will be made to it in the following chapters, a few points need to be explained about it. First of all, as mentioned earlier, it is called a polygon because it usually has several angles. This is typically the case when data are obtained from a small group of subjects. When data are obtained from a large population, the frequency distribution will be something like this:

Figure 7.4 A frequency distribution curve (horizontal axis: scores; vertical axis: frequency)

Second, the above figure is a typical distribution curve, also referred to as a normal distribution curve, and will occur when data are obtained from a large and normal population. The peak (the highest point) of the curve indicates the most frequent score. The score that has the highest frequency in the distribution is called the mode. In a normal distribution, the mode is at the centre of the distribution, and the distribution is symmetric, that is, the two sides of the graph are identical in shape. On the other hand, when data are obtained from a specific and small group of subjects, the mode may not be at the centre. For example, when most of the scores in a distribution are high and only a few are low or vice versa, the peak of the distribution may shift toward the right or the left side of the distribution. At such times, we say the distribution is skewed. If a majority of scores are high and the mode is on the right side of the graph, and a few low scores are the cause of skewness, the distribution curve is said to be negatively skewed.
Figure 7.5 A negatively skewed distribution (horizontal axis: scores; vertical axis: frequency)

Conversely, if most of the scores are low and a few high scores cause the skewness, the distribution will be positively skewed.

Figure 7.6 A positively skewed distribution (horizontal axis: scores; vertical axis: frequency)

Finally, it was said that a normal distribution curve has a peak or most frequent score, which is called the mode. In some less typical situations, there might be no peak in the graph. This happens when all scores in the distribution have the same frequency, and no score is more frequent than others. Such a distribution is called a flat distribution.

Figure 7.7 A flat distribution (horizontal axis: scores; vertical axis: frequency)

Describing the data

Apart from summarising and graphically displaying data, a number of other steps should be taken to make data more conducive to the conduct of research. Altogether these steps are referred to as describing the data. The purpose of these steps is to describe some of the characteristics of distributions of scores and to clarify certain concepts that are both necessary and quite helpful to the conduct of research. In any set of scores, there are minimum and maximum scores, and the rest of the scores are between the two extremes. Obviously enough, neither the maximum nor the minimum score can be representative of that distribution. If we are to use one score to represent the distribution, a more likely candidate is a score that is somewhere at the centre of the distribution. At the same time, in any normal distribution, there is a tendency towards the centre. In other words, a majority of scores tend to be around the centre of the distribution. This is also supported by logic and intuition. Under normal circumstances how many people, do you think, can get the perfect score in an examination? Few. How many people get an extremely low score?
Again, just a few. Most people get scores between the two extremes. This tendency of scores towards the centre of the distribution is referred to as central tendency.

Measures of central tendency

There are three measures of central tendency.

1. mode
2. median
3. mean

1. Mode

You are already familiar with mode. Mode means fashion. So, mode is the score that is fashionable, that is, the score that appears most frequently in the distribution. In the following set of scores, therefore, the mode is 5.

1, 2, 3, 3, 4, 5, 5, 5, 6, 6, 7, 8, 9, 9, 10

To find the mode, all you have to do is to look at the simple frequency of scores. The score with the highest frequency is the mode. Sometimes, two adjacent scores may have the same frequency as in the following set of scores:

1, 2, 3, 4, 5, 5, 6, 6, 7, 8, 9

In the above distribution, both 5 and 6 are the most frequent scores. In such cases, the average of the two most frequent scores will be the mode. Thus, in the above distribution, the mode is 5.5. There are also times when there are two most frequent scores in the distribution that are not adjacent as in:

1, 2, 2, 3, 3, 3, 4, 5, 6, 6, 6, 7, 8, 8

In this distribution, 3 and 6 have the highest frequency, but they are not adjacent. When this happens, the distribution is said to be bimodal (having two modes).

Figure 7.8 A bimodal distribution

2. Median

Median is the score at the 50th percentile. It is the score right in the middle of the distribution, dividing the rank of scores into two equal halves so that 50 percent of the scores are above and 50 percent are below the median.
When the number of scores in the distribution is odd, the middle score is the median as in the following distribution in which the median is 5:

2, 2, 3, 4, 4, 5, 6, 7, 8, 8, 9

When the number of scores in the distribution is even, the median is the average of the two scores at the centre of the distribution. In the following set of scores, the median is 5.5 (the average of 5 and 6).

1, 2, 2, 3, 4, 5, 6, 7, 7, 8, 8, 9

3. Mean

Mean is the mathematical average of the scores. To obtain the mean, all the scores in the distribution are added up and the result is divided by the total number of scores.

$\bar{X} = \frac{\sum X}{N}$

In this formula, the symbol Σ (sigma) means ‘sum of’, X is used to represent ‘scores’, and N is the number of the scores in the distribution. When there are a large number of scores, or when there are no extreme scores (scores that are radically distant from the rest of the scores) in the distribution, mean is the most reliable measure of central tendency because both mode and median are subject to rapid fluctuations.

Measures of variability

It was pointed out that there is a tendency towards the centre in any normal distribution. At the same time, in every distribution, there is variability, that is, there are differences among scores. Variability is part and parcel of mankind and the nature around us. So, it is quite natural to have differences between and among scores in a distribution. There are three measures of variability that show how scattered scores are. These measures include:

1. range
2. standard deviation
3. variance

1. Range

Range refers to the difference between the highest and the lowest score in a distribution. To obtain range, we simply subtract the minimum score from the maximum score.

$R = X_{max} - X_{min}$

In the following set of scores, range is 15.
2, 3, 4, 6, 9, 11, 12, 13, 13, 15, 17

$R = X_{max} - X_{min} = 17 - 2 = 15$

Although a very simple and easily obtainable measure of variability, range has a major disadvantage: it is sensitive to extreme scores. In other words, it is subject to radical change because of just one extreme score.

2. Standard deviation

In any distribution, there is a mean. Some scores are above the mean and some are below. Deviation (D) refers to the difference between each score and the mean.

$D = X - \bar{X}$

[Figures 7.9 and 7.10: scores X1 to X6 located on either side of the mean on the score axis; the vertical axis shows frequency]

Standard deviation is a kind of average of the deviations, that is, of the differences between each score and the mean. To compute standard deviation, first of all, the mean of the scores is calculated. Then the mean is subtracted from each score. Next the difference between each score and the mean is squared. The squared deviations are then added up and divided by their total number minus one. The square root of the outcome is standard deviation. These steps are summarised in the following formula:

$SD = \sqrt{\frac{\sum (X - \bar{X})^2}{N - 1}}$

This formula requires that the difference between each score and the mean be computed separately, every deviation score be squared, and the squared deviations be added up and then divided by N - 1. According to statisticians, standard deviation can also be computed from the raw scores using the following formula:

$S = \sqrt{\frac{\sum X^2 - [(\sum X)^2 / N]}{N - 1}}$

In this formula, $\sum X^2$ refers to the sum of the squared scores, and means that all scores should be squared and then added up; whereas, $(\sum X)^2$ refers to the sum of scores squared, and means that all scores should be added up and the outcome should be squared.

3. Variance

Once we have standard deviation, we can easily obtain variance. All we need to do is to square the standard deviation.
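The two standard deviation formulas above, together with range and variance, can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are invented here, and the data are the ten scores of the chapter's worked example (table 7.1).

```python
import math

# The ten scores used in the worked example of table 7.1.
scores = [19, 18, 17, 16, 15, 15, 14, 13, 12, 11]


def score_range(xs):
    """Range: maximum score minus minimum score."""
    return max(xs) - min(xs)


def variance_from_deviations(xs):
    """Sum of squared deviations from the mean, divided by N - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def variance_from_raw_scores(xs):
    """Raw-score version: (sum of X^2 - (sum of X)^2 / N) / (N - 1)."""
    n = len(xs)
    sum_of_squares = sum(x * x for x in xs)  # square first, then add
    square_of_sum = sum(xs) ** 2             # add first, then square
    return (sum_of_squares - square_of_sum / n) / (n - 1)


def standard_deviation(xs):
    """Positive square root of the variance."""
    return math.sqrt(variance_from_deviations(xs))


print(score_range(scores))
print(round(variance_from_deviations(scores), 2))
print(round(variance_from_raw_scores(scores), 2))
print(round(standard_deviation(scores), 2))
```

Both variance functions should agree up to floating-point rounding, which is a quick way to check a hand computation. Python's standard `statistics` module also offers `variance` and `stdev`, which likewise divide by N - 1.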
To compute variance (V) directly, the following formula is used:

$V = \frac{\sum (X - \bar{X})^2}{N - 1}$

Both standard deviation and variance are among the most versatile statistical concepts and are used in a variety of statistical operations. An example may help to further clarify the way standard deviation and variance are computed. Table 7.1 contains a set of scores ranked from the highest to the lowest.

Table 7.1

$X$    $X - \bar{X}$     $(X - \bar{X})^2$
19     19 - 15 = 4       16
18     18 - 15 = 3       9
17     17 - 15 = 2       4
16     16 - 15 = 1       1
15     15 - 15 = 0       0
15     15 - 15 = 0       0
14     14 - 15 = -1      1
13     13 - 15 = -2      4
12     12 - 15 = -3      9
11     11 - 15 = -4      16
$\sum X = 150$   $\sum (X - \bar{X}) = 0$   $\sum (X - \bar{X})^2 = 60$

To compute variance, as was said earlier, we need to calculate the mean. So, the scores are added up (the sum is 150) and the result is divided by the total number of the scores (10). The outcome is 15. Now the mean should be subtracted from each score separately. The second column of table 7.1 contains the deviations. As can be seen, the total sum of the deviation scores, $\sum (X - \bar{X})$, always equals zero since the positive and negative values cancel each other out. To avoid this dead end, we square the difference between each score and the mean. The third column in table 7.1 includes the squared deviations. The squared deviations are added up and then divided by N - 1. Here are the computations:

$V = \frac{\sum (X - \bar{X})^2}{N - 1} = \frac{60}{9} = 6.67$

Standard deviation is the positive square root of variance. So:

$S = \sqrt{V} = \sqrt{6.67} = 2.58$

Chapter 8 Standardized Scores

Introduction

In the previous chapters, it was pointed out that to conduct research, researchers begin with research questions, formulate hypotheses, review the related literature, and then gather data to test their hypotheses.
After analysing the data, researchers arrive at conclusions and make interpretations. It was also held that raw data are not very interpretable per se. That is why a number of basic statistical operations were presented to make data more easily and consistently interpretable. Due to their nature, raw data are subject to misinterpretation. Consider the following example to see how raw data can be misleading. Suppose Rose and Mary are twin sisters. They are in the same grade but study in different classes. They have taken a test and obtained the following scores:

Rose   14
Mary   19

Which one do you think has got a better score? You might be tempted to say, “Well, it is obvious. Mary’s score is much better”. But what if Rose’s score is out of 20 and Mary’s score out of 40? It becomes clear that Rose has got a better score than Mary. Now imagine that in Rose’s class, the exam was very easy and most of the students got better scores than Rose, and the mean score of the class was 17; whereas, in Mary’s class, because of the difficulty of the exam, the mean of the class was 16 and only a few students got better scores than that of Mary. In other words, Rose is quite below the average while Mary is well above the average of their classes. Who has got a better score now? You see that our interpretation of Mary’s and Rose’s scores as to which one is better shifts from Mary to Rose, then back to Mary, and so forth. These fluctuations make our interpretations unreliable. We need scores that are understandable and interpretable in a reliable manner without recourse to other pieces of information. Such scores are called standardized scores or simply standard scores. Z-score and T-score are two examples of such scores. Before talking about standard scores, however, a few points need to be explained about the normal distribution curve.
The normal distribution curve

In chapter 7, reference was made to the normal distribution curve. It was mentioned that with large data, the frequency distribution curve will look something like the following figure:

Figure 8.1 A normal distribution curve (horizontal axis: scores; vertical axis: frequency)

The normal distribution curve, also referred to as the bell-shaped curve, has four characteristics. First of all, it is unimodal, that is, it has only one mode or most frequent score. So, a bimodal distribution is not a normal distribution. Second, it is symmetric. This means that the mode is at the centre of the distribution, and the two sides of the distribution above and below the mean are identical. In other words, the shape of the distribution curve above the mean is exactly like that below the mean. If you fold the distribution curve at the mean, the two sides of the distribution cover each other perfectly. The third characteristic is that since the distribution is symmetric, mode, median, and mean are all the same and equal in value. The fourth characteristic is that the normal distribution is asymptotic, that is, the tails of the curve never meet the horizontal line. It implies that the probability of no score is zero. Apart from the above-mentioned characteristics, the normal distribution has a general characteristic, which was discussed in the previous chapters: central tendency. Namely, a majority of the scores tend to be around the mean. The more we move away from the mean, the fewer the scores become. But to what extent are the scores centred around the mean? On the basis of the assumptions underlying the normal distribution curve, and making use of the concept of standard deviation, statisticians have been able to prepare tables representing the proportion of area under the normal curve.
Using standard deviation as the yardstick, these tables show what percentage of scores fall between the mean and a given score depending on how many standard deviations that score is away from the mean. For instance, according to statisticians, almost 34% of the scores in a normal distribution are between the mean and one standard deviation away from the mean (either above or below). We also know that 50% of the scores fall below the mean (because the distribution is symmetric). So, if a score is one standard deviation above the mean, it is better than about 84% (34% + 50%) of the scores in the distribution.

Figure 8.2 (horizontal axis: scores from -3 to 3 standard deviations around the mean; vertical axis: frequency; areas marked on the curve: .3413, .1359, .0228)

Standardized scores

At the beginning of this chapter, it was made clear that raw scores are not fully interpretable per se because to know the real value of a score and its position among other scores, we need other pieces of information. Meanwhile, we learnt above that if we knew the distance between each score and the mean in terms of standard deviation, we could know the position of that score in the distribution. To put everything in a nutshell, to know about the position of scores in a distribution, instead of raw scores, we need to represent scores in a way that indicates how many standard deviations each score is either above or below the mean. To this end, Z-score is just what we need. Z-score is a standard score indicating how many standard deviations a given score is from the mean. It shows the difference between a given score and the mean in terms of standard deviation.
The Z-formula is as follows:

$Z = \frac{X - \bar{X}}{S}$

Supposing that the mean score of a class is 15 and standard deviation is 2, if one of the students of the class has got 17, we can turn the raw score into Z-score this way:

$\bar{X} = 15$, $S = 2$, $X = 17$, $Z = ?$

$Z = \frac{X - \bar{X}}{S} = \frac{17 - 15}{2} = 1$

This means that the student with a raw score of 17 has a percentile rank of 84, that is, s/he is better than about 84 percent of the students in the class. Once we have the Z-score, the raw score can be easily computed provided that the mean and standard deviation are known. To obtain the raw score from the Z-score, the Z-formula can be turned into the following formula:

$X = Z(S) + \bar{X}$

For example, in a class with a mean of 14 and standard deviation of 1.5, the raw score of a student with a Z-score of -1 is calculated as follows:

$X = Z(S) + \bar{X} = -1(1.5) + 14 = 14 - 1.5 = 12.5$

A couple of important points need clarification here. The first point is that if the Z-score is a whole number, the case is relatively straightforward. Figure 8.2 tells us what proportion of scores fall between the mean and the Z-scores of 1, 2, and 3 (as well as -1, -2, and -3). What if the distance between a score and the mean is not a whole number but a fraction of standard deviation? Consider the following example:

$\bar{X} = 12$, $S = 1.5$, $X = 11$, $Z = ?$

$Z = \frac{X - \bar{X}}{S} = \frac{11 - 12}{1.5} = \frac{-1}{1.5} = -.66$

In this example, the score 11 is 0.66 of a standard deviation below the mean. Earlier, it was mentioned that statisticians have prepared tables showing the proportion of area under the normal curve. These tables indicate the proportion of scores between any Z (whole number or fraction) and the mean. A copy of one such table showing these proportions is given in appendix 1. Using the table in appendix 1, we can see that the proportion of scores between the mean and the Z-score of -0.66 is .2454.
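The conversions between raw scores and Z-scores, and the table lookup, can be approximated in code. The sketch below is illustrative only: the function names are invented here, and instead of the printed table in appendix 1 it computes the area under the normal curve from the error function in Python's `math` module, which is a substitution made for this example rather than the book's procedure.

```python
import math


def z_score(x, mean, sd):
    """How many standard deviations the raw score x lies from the mean."""
    return (x - mean) / sd


def raw_score(z, mean, sd):
    """Recover the raw score from a Z-score: X = Z(S) + mean."""
    return z * sd + mean


def area_between_mean_and_z(z):
    """Proportion of a normal distribution lying between the mean and z.

    Stands in for the printed table; the area is the same for +z and -z.
    """
    cdf = 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0)))
    return cdf - 0.5


print(z_score(17, 15, 2))        # class mean 15, standard deviation 2
print(raw_score(-1, 14, 1.5))    # class mean 14, standard deviation 1.5
print(round(area_between_mean_and_z(1), 4))
print(round(area_between_mean_and_z(-0.66), 4))
```

Adding .50 to the area gives the proportion of scores at or below a score above the mean; subtracting the area from .50 gives the proportion below a score that lies under the mean.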
The proportion .2454 means that 24.54 percent of the scores are between our Z-score and the mean. You remember that 50 percent are above the mean. So, altogether, around 74.54 percent (24.54% + 50%) of the scores are better than the score in question, and the remaining 25.46% are worse than it. So, the percentile rank of the student with such a score is 25.46. Another point to clarify is that when the Z-score is negative, the proportion of area under the normal curve is the same as that for the corresponding positive value of Z. Only the interpretation differs. For instance, when the Z-score is -1, it means that the score is one standard deviation below the mean. The proportion of scores between that score and the mean is .3413, but since the score is below the mean, it is worse than 84.13% (34.13% + 50%) of the scores. One little problem with Z-score is its low magnitude. A student with a Z-score of +1 is better than about 84% of the class members, yet his/her Z-score is only one. To obviate this, in the following formula, some fixed values can be conventionally used for mean and standard deviation to increase the magnitude.

$X = Z(S) + \bar{X}$

In Z-score, the mean is considered to be zero and the standard deviation is supposed to be one. In T-score, which is another standard score, mean and standard deviation are conventionally considered to be 50 and 10, respectively. Thus the T-formula is:

$T = Z(10) + 50$

Therefore, a Z-score of +2 is equal to a T-score of 70.

$T = 2(10) + 50 = 70$

Chapter 9 Probability Estimation and Hypothesis Testing

Introduction

In chapter 8, the concept of Z-score was explained, and its advantages over raw scores were discussed. Apart from this, Z-score has a significant application in inferential statistics (in hypothesis testing and probability estimation).
There are two kinds of statistics: descriptive and inferential. In descriptive statistics, the goal is to summarize and describe data obtained from a group of subjects in order to learn about the characteristics of that group of subjects. An example would be to describe and summarize data pertaining to characteristics (such as age, linguistic background, marital status, etc.) of the students of a university.

On the other hand, most of the time, researchers are not really interested in discovering characteristics of a limited group of subjects. Even when studying a small group of subjects (called a sample) selected from a large group (referred to as the population), their aim is to learn about the characteristics of the population, not just the sample itself. For example, if a researcher intends to investigate the effect of age on language learning, although for manageability reasons s/he usually selects a limited group of subjects, his/her ultimate aim is to generalize the findings to the whole population of young and old learners. In other words, in inferential statistics, researchers describe and summarize data obtained from a sample to learn about one or more characteristics of the sample. Any characteristic of a sample is referred to as a statistic. Then, they generalize their findings and make inferences about the corresponding characteristic of the population, which is called a parameter.

In short, descriptive statistics studies the characteristics of a sample (statistic), whereas inferential statistics aims at discovering the characteristics of the population (parameter). One can conclude, therefore, that any characteristic described through descriptive statistics is a statistic, and any characteristic described and inferred through inferential statistics is a parameter.

Inferential statistics, by its very nature, is based on probability.
Because we do not study the whole population, we can never be 100 percent sure about the characteristic of the population. Rather, we only guess a parameter based on our knowledge of a statistic. The Z-formula has applications in inferential statistics. Using the Z-formula, we can estimate the probability of an individual score belonging to a sample as well as the probability of a sample belonging to a population. Thus, the concept of Z can be very helpful in testing hypotheses.

Probability

Probability is defined as the ratio of the number of desired events to the total number of possible outcomes, multiplied by one hundred.

$$\text{Probability} = \frac{\text{Number of desired events}}{\text{Number of possible outcomes}} \times 100$$

As an example, consider a coin. A coin has two sides: heads and tails. If you toss a coin, the probability of the coin coming down heads up is 50% because the number of desired events is one and the total number of possible outcomes is two.

$$\text{Probability} = \tfrac{1}{2}(100) = 50\%$$

Or in a multiple-choice item with four alternatives, the probability of getting the answer right without any knowledge is 25%.

$$\text{Probability} = \tfrac{1}{4}(100) = 25\%$$

But getting at the probability of an event is not always so straightforward. Think of a class with a mean of 15 and standard deviation of 2. What is the probability of score 13 existing in such a class? In cases like this, certain computations are needed. It is here that the Z-formula can be made use of.

The probability of a score belonging to a distribution

Using the Z-formula and the table of the proportion of area under the normal curve (appendix 1), we can estimate the probability of a score belonging to a distribution if we have only the mean and standard deviation of the distribution. In the above-mentioned example, the mean of the distribution was 15 and the standard deviation was 2.

$$\bar{X} = 15, \quad S = 2, \quad X = 13, \quad \text{Probability of } X = ?$$
To compute the probability of the score 13 belonging to the distribution with the above-mentioned characteristics, the first step to take is to convert this raw score into the standard Z-score. The corresponding Z-value of 13 is -1 because:

$$Z = \frac{X - \bar{X}}{S} = \frac{13 - 15}{2} = \frac{-2}{2} = -1$$

[Figure 9.1: normal curve, frequency plotted against scores from Z = -3 to Z = 3, showing 34.13% of the area between the mean and Z = -1 and 15.87% beyond Z = -1]

The moment we notice the Z-score is negative, we know that such a score (13) is below the mean. So, it cannot be in the positive side of the distribution. If such a score is to be found in the distribution, it has to be in the negative side of it. Since the normal distribution is symmetric, we are sure that 13 cannot be among the 50% of the scores that are above the mean. On the other hand, appendix 1 tells us that 34.13% of the scores are between the mean and the Z of -1. So 13 cannot be there. There remains the area beyond the Z-score of -1. According to appendix 1, the area beyond Z of -1 is .1587. Therefore, if 13 is to be found in the distribution, it has to be in the area beyond the Z-score of -1. This means that the probability of score 13 in a distribution with the given characteristics is 15.87%.

Let us have another example. What is the probability of score 16 in a distribution with a mean of 13 and standard deviation of 2?

$$\bar{X} = 13, \quad S = 2, \quad X = 16, \quad \text{Probability of } X = ?$$

$$Z = \frac{X - \bar{X}}{S} = \frac{16 - 13}{2} = 1.5$$

[Figure 9.2: normal curve showing 43.32% of the area between the mean and Z = 1.5 and 6.68% beyond Z = 1.5]

Appendix 1 tells us that the proportion of area between the mean and the Z-score of 1.5 is .4332 and the area beyond Z is .0668. Based on what was said earlier, we know that score 16 should be in the positive half of the distribution because the Z is positive. Out of the 50% of scores in the positive side of the distribution, 43.32% are between the mean and the Z-score of 1.5. So, the probability of 16 belonging to such a distribution is the remaining 6.68% (the area beyond).
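The two worked examples above can be reproduced with a short Python sketch. It is only an illustration: the function name is hypothetical, and the areas come from the standard library's normal distribution instead of the printed table in appendix 1, so the last decimal place may differ slightly from table values.

```python
from statistics import NormalDist


def areas_for_score(raw, mean, sd):
    """Return (Z, area between mean and Z, area beyond Z) for a raw score.

    Assumes the scores are normally distributed.
    """
    z = (raw - mean) / sd
    # Area between the mean and Z is the same for +Z and -Z (symmetry).
    between = abs(NormalDist().cdf(z) - 0.5)
    # Whatever is left of that half of the curve lies beyond Z.
    beyond = 0.5 - between
    return z, between, beyond


for raw, mean, sd in [(13, 15, 2), (16, 13, 2)]:
    z, between, beyond = areas_for_score(raw, mean, sd)
    print(f"score {raw}: Z = {z}, between = {between:.4f}, beyond = {beyond:.4f}")
```

The "area beyond Z" returned here is the quantity the text interprets as the probability of the score belonging to the distribution.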
The probability of a mean score belonging to a population

The Z-formula can also be used to estimate the probability of a mean score belonging to a population. An example of a situation that requires the estimation of such a probability would be as follows:

Suppose you have a group of 49 university students. You intend to teach them general English with a newly devised method (called A), which you guess is more effective than other current methods. After doing all the prerequisites, you introduce the treatment (instruction in method A) and then give your subjects a test. Suppose further that the mean score of your class on a proficiency test such as TOEFL turns out to be 510. Now, you want to compare your class with the whole population of Iranian university students who are currently taking their general English course. Given that you know the grand average of the population (represented as μ) on the proficiency test is 500, you might be tempted to conclude that your group of subjects has outperformed the population, and that your method has been effective.

However, great care should be exercised in such cases. The sheer fact that the mean of your group is above the population mean does not in any way guarantee that the treatment has been effective. Remember that variability is one of the characteristics of any normal distribution. In other words, in any normal distribution, there are scores above and below the mean. This means that although your sample mean is higher than the population mean, the sample may still belong to the normal distribution. In that case, you cannot claim that your treatment is effective, because there will be no meaningful difference between the sample mean and the population mean. To see if the sample mean belongs to the distribution of means in the population, the following formula is used:

$$Z_{\bar{X}} = \frac{\bar{X} - \mu}{S_{\bar{X}}}$$

Notice that this formula is the same as the Z-formula we had earlier.
$$Z = \frac{X - \bar{X}}{S}$$

In the new formula, instead of having individuals, we have groups (samples). So, instead of single scores, we have a distribution of means. These mean scores have a grand average (the mean of means) that is represented as μ.

[Figure 9.3: distribution of the sample means $\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_6$ around the population mean μ]

In other words, in the new formula, $\bar{X}$ stands for the sample mean, μ represents the population mean, and $S_{\bar{X}}$ represents the standard deviation of means.

The standard deviation of means poses a problem. To compute it, you need to have all the sample means in the population, and then apply the standard deviation formula. The problem is that you have only the mean of your sample and the mean of the population. You have no clue as to what the other mean scores are. So, you cannot compute $S_{\bar{X}}$ directly. Still, we know that the standard deviation of means becomes smaller as the number of scores in each sample grows. Statisticians contend that the standard deviation of means can be estimated from the standard deviation of the sample ($S_X$) and the sample size ($n$) using the following formula:

$$S_{\bar{X}} = \frac{S_X}{\sqrt{n}}$$

Now you have everything you need to apply the new Z-formula to test the effect of your new method (A). Here is the summary of the data and the calculations:

$$S = 35, \quad \bar{X} = 510, \quad N = 49, \quad \mu = 500$$

$$S_{\bar{X}} = \frac{S_X}{\sqrt{n}} = \frac{35}{\sqrt{49}} = 5$$

$$Z_{\bar{X}} = \frac{\bar{X} - \mu}{S_{\bar{X}}} = \frac{510 - 500}{5} = 2$$

[Figure 9.4: normal curve showing 47.72% of the area between μ and Z = 2 and 2.28% beyond Z = 2]

The Z-table in appendix 1 indicates that the percentage of scores between the mean and the Z-score of 2 is 47.72%. So you learn that your sample mean is neither in the negative half of the distribution nor in the 47.72% of scores between the mean and the Z of 2. In other words, you are 97.72 percent sure that such a mean (510) does not belong to this population. The probability of your mean belonging to the population is 2.28%. What does this mean? Does it mean that your group does not belong to the population and is better than it?
Not necessarily. Remember that there is a 2.28% chance that your group is not different from the population and belongs to the normal population. On the other hand, as was said earlier, in inferential statistics, nothing can be proven with 100 percent certainty; there is always room to make mistakes.

How should this result be interpreted then? It depends on the extent of the error that can be tolerated. The extent of the mistake to be tolerated depends, in turn, on the importance and significance of the research and its effect on mankind. In medical research, for example, one has to be much more sensitive to mistakes than in, say, language research. In medical research, the researcher cannot conclude that a certain drug is effective for a certain illness if s/he is only 97 percent sure. There is a 3 percent chance that it might endanger the health or even the life of people. So, in areas like medical research, researchers need to exercise a lot of care.

In language issues, researchers have agreed on two levels of mistake to be tolerated: 1% and 5%. The extent of the possible mistake to be tolerated is referred to as the level of significance and is represented as α. The level of significance is expressed in terms of the proportion of area under the normal curve. Hence, the level of significance corresponding to a one percent mistake is .01 and that of a 5% mistake is .05.

When formulating research hypotheses, researchers should decide on the level of significance of their study based on the importance it has. It is important to remember that they cannot change the level of significance later. In the above-mentioned example, if you chose the .05 level of significance (5% mistake), you could now safely conclude that your group is significantly different from the normal population and does not belong to it.
Whereas, if you chose the .01 level of significance (1% mistake), you couldn’t make such a claim because you allowed for only one percent of possible mistake, but now you have 2.28% of possible mistake.

Testing Hypotheses

In chapter 3, it was explained that research hypotheses are either directional or non-directional (null). It was also explained that when a directional hypothesis is formulated, the researcher is certain about the direction of the relationship between two variables; only the extent of the relationship is to be determined. For instance, when a researcher hypothesizes that there is a positive relationship between intelligence and language learning, s/he has nothing to do with the negative side of the distribution. Rather, s/he only wants to locate the position of his/her experimental group in the positive side of the normal distribution.

[Figure 9.5: A one-tailed distribution, with the 5% region in one tail]

Such a distribution is one-tailed. Supposing that the level of significance is .05, all of it falls on one side of the distribution. On the other hand, when the hypothesis is non-directional, the distribution is two-tailed.

[Figure 9.6: A two-tailed distribution, with a 2.5% region in each tail]

In a two-tailed distribution, because the direction of the relationship cannot be predicted (it may be either positive or negative), the extent of the tolerated error (level of significance) should be equally divided between the two tails. Thus, if the level of significance is .05, only 2.5% of possible mistake can be tolerated on either tail of the distribution instead of 5%.

In chapter three, it was held that for a number of reasons, the null hypothesis should be used whenever possible. Now you can easily understand the last reason. In non-directional hypotheses, the extent of the possible mistake in each tail is actually half that of directional hypotheses.
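The sample-mean procedure and the one-tailed versus two-tailed split described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the critical value is computed from the standard library's normal curve instead of being looked up in appendix 1, and the one-tailed branch assumes the directional hypothesis predicts a sample mean above the population mean.

```python
from math import sqrt
from statistics import NormalDist


def sample_mean_z(sample_mean, pop_mean, sd, n):
    """Observed Z for a sample mean: (mean - mu) / (S / sqrt(n))."""
    standard_error = sd / sqrt(n)
    return (sample_mean - pop_mean) / standard_error


def critical_z(alpha, two_tailed):
    """Z whose 'area beyond' equals alpha (one tail) or alpha / 2 (two tails)."""
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)


def is_significant(z_obs, alpha, two_tailed):
    """True when the observed Z reaches the critical Z."""
    z_crit = critical_z(alpha, two_tailed)
    # One-tailed case: assumes the predicted direction is positive.
    return abs(z_obs) >= z_crit if two_tailed else z_obs >= z_crit


# The method A example: n = 49, S = 35, sample mean 510, mu = 500.
z_obs = sample_mean_z(510, 500, 35, 49)
print(z_obs)
print(is_significant(z_obs, alpha=0.05, two_tailed=False))
print(is_significant(z_obs, alpha=0.01, two_tailed=False))
```

Computed critical values carry more decimal places than a printed table, so they can differ from table entries in the last rounded digit.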
To summarize, in the early stages of conducting research, when formulating hypotheses, you should decide on the level of significance and stick to it throughout your research. You should also take into account the kind of research hypothesis. If the hypothesis is directional, the entire possible mistake is considered in one tail of the distribution; if the hypothesis is non-directional, the extent of the tolerable mistake in each tail is reduced to half. Next, you should find, in the Z-table, the value you need to claim that the sample (the experimental group) is meaningfully and significantly different from the population and does not belong to it. This Z-value is called the critical Z-value. For instance, if you have a directional hypothesis and the level of significance is .05, you must be 95% certain in order to prove your claim. This means that the area beyond Z in the Z-table should not be more than .05. The Z-value corresponding to the .05 level of significance in the case of a directional hypothesis is 1.65. Your observed Z value must be equal to the critical Z value or exceed it so that you can prove your claim. An example may help clarify the point.

Example

To study the effect of a treatment (A) on the language learning of a group of 36 subjects, having observed all other requirements, you have administered a post-test and the following data have been obtained. Supposing that the population mean is 150, test your null hypothesis at the .01 level of significance.

$$S = 18, \quad \bar{X} = 159, \quad N = 36, \quad \mu = 150, \quad \alpha = .01, \quad \text{hypothesis = non-directional}$$

Prior to doing any computations, let us determine the critical Z value. Since the hypothesis is non-directional, we should halve the extent of mistake. Namely, we can have only half a percent of mistake in each tail. Hence, the area beyond Z should be .005. Using the Z-table in appendix 1, we find the Z-value corresponding to the above index. According to the table, the critical Z value is 2.58.
Now we compute the observed Z. If it is equal to or greater than the critical Z-value, the null hypothesis is rejected and the treatment is effective; otherwise, the null hypothesis is supported and the treatment has no significant effect.

$$S_{\bar{X}} = \frac{18}{\sqrt{36}} = 3$$

$$Z_{\bar{X}} = \frac{159 - 150}{3} = 3$$

Since the observed Z-value exceeds the critical Z value, the null hypothesis is rejected, and it is statistically proven that the treatment is effective.

Chapter 10
Comparing Two Means (T-test)

Introduction

In chapter 9, we discussed the use of Z in comparing a sample mean with a population mean. The Z procedure is used when there is only one sample group. More commonly, however, researchers use more than one group. Remember that one of the characteristics or principles of true experimental research is the existence of experimental and control groups. So, there are times when researchers select two sample groups belonging to the same population, use one as the experimental group and the other as the control group, and then compare the mean scores of the two samples. In such situations, the T-test is used.

Comparing two sample means

Suppose we have decided to conduct research on the effect of sex on Iranian university students’ language learning. For this purpose, we obviously need two groups randomly selected from the population: a group of male subjects and a group of female subjects. We give both groups instruction for a certain period of time and under identical circumstances. Finally, we give a post-test to both groups and obtain the raw data. The procedure to compare the two sample means is much like the Z procedure. Suppose that the obtained raw data are as follows:

G1 = female: $n = 25, \quad s = 3, \quad \bar{X} = 17.5$
G2 = male: $n = 25, \quad s = 4, \quad \bar{X} = 16$
$\alpha = .05$, hypothesis = null

To see if there was a statistically significant difference between a sample mean and a population mean, we used the following formula:

$$Z_{\bar{X}} = \frac{\bar{X} - \mu}{S_{\bar{X}}}$$

You remember that $S_{\bar{X}}$ refers to the standard deviation of the means and is estimated this way:

$$S_{\bar{X}} = \frac{S_X}{\sqrt{n}}$$

The gist of the matter is that to check the difference between the two means, in the previous chapter, we subtracted one of the means from the other and divided the result by the standard deviation of means. More or less the same thing is done in the T-test. The formula to be used here is:

$$t_{obs} = \frac{\bar{X}_1 - \bar{X}_2}{S_{(\bar{X}_1 - \bar{X}_2)}}$$

In this formula, $S_{(\bar{X}_1 - \bar{X}_2)}$ refers to the standard deviation of the difference between the means and is computed using the following formula:

$$S_{(\bar{X}_1 - \bar{X}_2)} = \sqrt{\left(\frac{S_1}{\sqrt{n_1}}\right)^2 + \left(\frac{S_2}{\sqrt{n_2}}\right)^2}$$

In the case of our example data, the computations are as follows:

$$S_{(\bar{X}_1 - \bar{X}_2)} = \sqrt{\left(\frac{3}{\sqrt{25}}\right)^2 + \left(\frac{4}{\sqrt{25}}\right)^2} = \sqrt{\frac{9}{25} + \frac{16}{25}} = \sqrt{1} = 1$$

Once we have the denominator, we can put it in the t formula and obtain the observed t value:

$$t_{obs} = \frac{\bar{X}_1 - \bar{X}_2}{S_{(\bar{X}_1 - \bar{X}_2)}} = \frac{17.5 - 16}{1} = 1.5$$

Now that we have the observed t value, we should check it against the critical t value. Just like the Z procedure, if the observed t value turns out to be equal to or greater than the critical t value, the difference between the two means is said to be statistically significant. Then the treatment is effective and the null hypothesis is rejected. On the other hand, if the observed t value is smaller than the critical t value, the null hypothesis is supported and the treatment is said to have no significant effect. But how is the critical t value obtained?

The critical t value

Obtaining the critical t value is a little different from obtaining the critical Z value.
In the Z-procedure, we just decided on the level of significance, and depending on whether the hypothesis was directional or non-directional, we found the Z value corresponding to the proportion of area beyond Z. In the T-test procedure, another table is used (see appendix 2). The table of critical t values has two dimensions. The first dimension indicates the different levels of significance for one-tailed tests (directional hypotheses) and two-tailed tests (non-directional hypotheses). The second dimension shows the degrees of freedom, which will be explained below. The intersection of the level of significance and the degree of freedom shows the corresponding critical value of t.

Degree of freedom refers to the number of quantities that can be freely determined. In the following equation, for instance, there are three parameters:

$$A + B = C$$

Out of the three parameters, however, only two can be freely chosen. Once we have assigned values to A and B (e.g., A = 3, B = 5), the value of C is already determined. It has to be 8, and we are not free to choose a value for it. Similarly, in the following equation, we are free to choose values for only three parameters out of four:

$$A + B - C = D$$

Basically, out of three parameters, our degree of freedom is two; out of 4 parameters, it is 3. Out of N parameters, the degree of freedom is N - 1.

To determine the degree of freedom, we apply N - 1 to each group of subjects. In our example case, each group included 25 members. So the degree of freedom in each group would be 24 (25 - 1). Since we had two groups, our total degree of freedom would be 48. If there were no such number in the t-table (appendix 2), then the closest number to it would be used as the degree of freedom. With these pieces of information, the critical t-value can be easily found.
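The independent-samples computation and the degree-of-freedom rule described above can be put together in a short Python sketch. The function name is a hypothetical one chosen for illustration, and the critical t value is deliberately left out: it still has to be read from the t-table in appendix 2, since the Python standard library has no t-distribution.

```python
from math import sqrt


def independent_t(mean1, s1, n1, mean2, s2, n2):
    """Return (observed t, total degrees of freedom) for two independent groups."""
    # (s / sqrt(n)) squared is the same as s squared / n.
    se_difference = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    t_obs = (mean1 - mean2) / se_difference
    # N - 1 is applied to each group, then the two results are added.
    df = (n1 - 1) + (n2 - 1)
    return t_obs, df


# Female group: n = 25, s = 3, mean 17.5; male group: n = 25, s = 4, mean 16.
t_obs, df = independent_t(17.5, 3, 25, 16, 4, 25)
print(round(t_obs, 2), df)
```

Returning the degrees of freedom alongside the observed t keeps together the two numbers needed for the table lookup.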
As for our example, considering the column showing the .05 level of significance for a two-tailed test and taking into account the degree of freedom (48), we can see that the critical t value is 2.021. Now we compare the observed t value with the critical t value. The observed t value (1.5) is smaller than the critical t value, suggesting that the difference between the two groups is not statistically significant, and that the treatment is not effective and the null hypothesis is supported. Let us summarise these in another example.

Example: To test the effect of a treatment, you have given a post-test to two groups. The following data are obtained. Supposing that everything is all right, test your null hypothesis at the .05 level of significance.

G1: $n = 36, \quad s = 24, \quad \bar{X} = 150$
G2: $n = 25, \quad s = 15, \quad \bar{X} = 140$
$\alpha = .05$, hypothesis = null

Before doing the computations, let us determine the critical t value. The degree of freedom is 35 + 24 = 59. The closest number to this in the t-table is 60. The critical t-value corresponding to 60 at the .05 level of significance in the two-tailed test is 2.00.

$$t_{crit} = 2.00$$

$$S_{(\bar{X}_1 - \bar{X}_2)} = \sqrt{\left(\frac{24}{\sqrt{36}}\right)^2 + \left(\frac{15}{\sqrt{25}}\right)^2} = \sqrt{(4)^2 + (3)^2} = \sqrt{25} = 5$$

$$t_{obs} = \frac{150 - 140}{5} = 2$$

Since the observed value is equal to the critical value, we can statistically reject the null hypothesis and conclude that the treatment is effective.

Matched t-test

In the preceding section, the means of two independent samples were compared. For this reason, the t-test procedure described above is also known as the independent t-test. There are times when researchers need to compare two means obtained from a single sample. When scores on two different variables are obtained from a single group, the matched t-test is used. For example, the researcher may give a pre-test and a post-test and hope to compare the two means.
Or one may give the subjects two different tasks and want to compare their performance on the tasks. To conduct such a study, only one group of subjects will be sufficient. The procedure is more or less the same as the independent samples t-test, but the formula is a bit different. In the matched t-test, the following formula is used:

$$t_{obs} = \frac{\bar{X}_1 - \bar{X}_2}{S_{\bar{D}}}$$

In this formula, the numerator is the difference between the two means. The denominator $S_{\bar{D}}$ stands for the standard deviation of the differences between scores, adjusted for the sample size. To compute $S_{\bar{D}}$, first of all, the difference between every pair of scores should be calculated. The difference between each pair of scores ($X_1 - X_2$) is called the deviation score and is represented as D. Then, the standard deviation of the deviation scores should be computed ($S_D$). Finally, the outcome should be adjusted for the sample size (that is, made sensitive to N) to have an estimate of $S_{\bar{D}}$. To clarify the point further, let us do the computations with an example. Suppose that the scores of the above-mentioned subjects on the pre-test and the post-test are those presented in the following table:

Table 10.1

S = subjects
X1 = scores on the pre-test
X2 = scores on the post-test
D = deviation score (X1 - X2)
D2 = squared deviation (X1 - X2)^2

S      X1     X2     D      D2
1      17     16      1      1
2      14     17     -3      9
3      18     15      3      9
4      11     14     -3      9
5      15     10      5     25
6      17     15      2      4
7      12      9      3      9
8      16     13      3      9
9      18     14      4     16
10     12     12      0      0
Σ     150    135     15     91

With these pieces of information, let us test our null hypothesis at the .05 level of significance. As usual, we are to determine the critical value of t as a first step. The degree of freedom is 9 (10 - 1). The critical value of t with this degree of freedom at the .05 level of significance is 2.26 (see appendix 2).

To compute the observed t value, the first thing to do is to compute the standard deviation of the deviation scores.
We have already learned (in chapter 7) that standard deviation is computed from raw scores using the following formula:

$$S = \sqrt{\frac{\sum X^2 - \left[(\sum X)^2 / N\right]}{N - 1}}$$

If we substitute D for X in the formula, it will be:

$$S_D = \sqrt{\frac{\sum D^2 - \left[(\sum D)^2 / N\right]}{N - 1}}$$

So $S_D$ is computed as:

$$S_D = \sqrt{\frac{91 - (225 / 10)}{9}} = \sqrt{\frac{68.5}{9}} \approx 2.76$$

This $S_D$ should be adjusted for the sample size. Just as $S_{\bar{X}}$ was estimated from $S_X$, so is $S_{\bar{D}}$ estimated from $S_D$.

$$S_{\bar{X}} = \frac{S_X}{\sqrt{n}} \qquad S_{\bar{D}} = \frac{S_D}{\sqrt{n}}$$

$$S_{\bar{D}} = \frac{2.76}{\sqrt{10}} = \frac{2.76}{3.16} \approx .87$$

The last step is to compute the observed value of t:

$$t_{obs} = \frac{\bar{X}_1 - \bar{X}_2}{S_{\bar{D}}} = \frac{15 - 13.5}{.87} \approx 1.72$$

Since the observed value of t is smaller than the critical value, we cannot reject the null hypothesis, which means that the difference between the two means is not statistically significant.

Assumptions underlying t-test

Although the t-test is a very useful and versatile statistical procedure, care must be taken in the use of it. There are certain assumptions underlying the t-test that should be met before using it; otherwise, the results may be misleading. These assumptions include the following:

1. The scores are measured on an interval scale. Namely, the t-test is not used to compare data obtained on ordinal or nominal scales.
2. In the independent group t-test, every subject should be assigned to only one group. No subject should be a member of both experimental and control groups.
3. Every subject’s score must be independent of any other subject’s score.
4. Scores should be approximately normally distributed, and the variances of the groups should not be significantly different from each other.
5. Finally, the most important assumption underlying the t-test is that it is ideally used to compare the means of only two groups.

The last assumption says that the t-test is not an appropriate statistical procedure for comparing the means of more than two groups.
The reason is that when the number of comparisons increases, so does the probability of making mistakes (that is, the level of significance). Statisticians say that the level of significance (α) changes according to the following formula:

$$\alpha = 1 - (1 - \alpha)^c$$

In this formula, c stands for the number of comparisons. If two means are compared, only one comparison can be made, and the level of significance remains intact. Supposing that the original level of significance was .05, with one comparison, it would not change.

$$\alpha = 1 - (1 - .05)^1 = 1 - .95 = .05$$

But if three means are compared, three comparisons are possible. Then, the original .05 level of significance will change this way:

$$\alpha = 1 - (1 - .05)^3 = 1 - (.95)^3 \approx 1 - .857 \approx .14$$

This means that although only 5% of possible mistake is allowed, the probability of mistakes is now about 14%.

Chapter 11
Reporting Research Findings

Introduction

In the previous chapters, we discussed how to form research questions; formulate hypotheses; review the literature; collect data; summarise, describe, and analyse the data; do certain statistical operations and come up with certain results. The purpose of this concluding chapter is to discuss the way the research findings should be reported. In order to make research findings understandable and useful for others, all researchers need to follow a common format in reporting their findings. In this chapter, one such format is described.

Research report format

There are different ways of reporting research findings. Researchers in different fields may use different formats for reporting their findings. So, the easiest way to find out how a research paper should be prepared is to check the major journals to which members of that field subscribe.
Nevertheless, since most papers in applied linguistics and language teaching use the APA (American Psychological Association) format, this format will be briefly described below.

Within the APA format, there are three major sections in any research report: introduction, method, and results. Before these main sections, however, certain preliminaries should be considered. Prior to anything else, the title of the research and the researcher’s name and affiliation should be given as follows:

Title (concise and exact, capitalize major words)
Researcher’s Name (double space below title, capitalize first letters)
Researcher’s Affiliation (University, organization, etc.) (double space below name, first letters in caps.)

This is followed by an abstract. An abstract is a summary of the research in approximately 150 words, which states the research question, method, and results. The abstract is written in the form of a single paragraph, the purpose of which is to let those who read it decide whether or not they really want to read the whole report.

This is followed by the introduction, which, although not labelled, is the first major section. The first part of the introduction contains the background to the research topic, research questions, the purpose of the research, and the significance of the study. It should be noted that here the research question is introduced generally, but not necessarily as a formal question. The second part of the introduction section is labelled ‘review of related literature’, which is usually a side heading. But if there is not much related literature, the heading may be omitted. The review of related literature deals with what other people have already done. After the review of literature, the researcher should clearly state the research question and formulate hypotheses. In short, the introduction section aims at answering four basic questions:

1.
What do you intend to do?
2. What are your predictions?
3. Why is the work important?
4. What has already been done?

Method

The second major part of a research report is the method section, whose heading is centred with the first letter capitalised. The method section includes the following parts:

Subjects

In the ‘subjects’ part, which is labelled as a side heading, the number of subjects is given and their characteristics are described. Depending on the topic of research, these characteristics may include age, sex, first language background, level of education, the number of groups they were assigned to, the criteria for their grouping, and so on.

Materials and procedures

Sometimes ‘materials’ and ‘procedures’ are two separate parts, each one being labelled as an independent side heading, and sometimes they form a single heading. In either case, the description of the materials used in the research comes first. For example, if the treatment includes some teaching materials, then the name of the source(s), number of chapters, units, lessons, even pages, and the type of classroom activities should be described. Or if tests are used for data collection, the characteristics of the tests, including the number of subtests, the number of items in each subtest, the proportion of items testing the various aspects of a characteristic, etc., should be clarified.

‘Procedure’ follows the description of ‘materials’. Here, the researcher gives a concise but detailed description of the way data were collected. In this part of the method section, the researcher explains how s/he administered the test (orally, in written form, etc.); whether or not s/he answered questions if there were any; in what language instructions were given, native or target; how long data collection took; what the scoring procedure was, and so on.
In short, in the ‘procedure’ section, the researcher gives a step-by-step description of whatever was done so that anybody who wishes to replicate the research knows what to do.

Data Analysis

The last part of the method section, which begins with a side heading with the initial letters capitalized, is ‘data analysis’. Data analysis explains what was done with the data after they were collected. For example, the researcher explains what statistical operations, tests, and formulae were used to analyse the data.

Results

The heading of the results section is also centred and capitalized. The section is made up of two parts. Sometimes the first part is referred to as ‘findings’ and the second part as ‘discussion’. At other times, the first part may be labelled ‘results’ and the second part ‘conclusion and discussion’. The names do not really matter. What matters is that the first part gives only the outcome of the statistical operations carried out. It shows, for example, what the observed value of t and the critical value of t turned out to be (supposing that a t-test was used). The second part, the ‘discussion’ part, gives an interpretation of the findings. It explains what the obtained results mean, and what implications (theoretical or practical) they may have.

References

Finally, the last part of a research paper is the reference section, in which all the books and other sources of information used are listed as follows:

Author’s last name, author’s first name, author’s middle name (usually initialised). Date of publication. The title of the book (usually italicised or underlined). Place of publication: name of the publisher.

Here is an example:

Hadley, Alice O. 2003. Teaching language in context. Boston, Massachusetts: Heinle & Heinle Publishers.

References for further reading

Farhady, H. 1995. Research methods in applied linguistics. Tehran: Payame Noor University Press.
Gravetter, F. J. & L. B. Wallnau. 1996. Statistics for behavioural sciences: A first course for students of psychology and education. 4th ed. St. Paul: West.
Hatch, E. & H. Farhady. 1982. Research design and statistics for applied linguistics. Rowley, Mass.: Newbury House.
Kinnear, P. R. & C. D. Gray. 2000. SPSS for Windows made simple: Release 10. Hove (UK): Psychology Press.
Seliger, H. W. & E. Shohamy. 1989. Second language research methods. Oxford: Oxford University Press.
Winer, B. J., D. R. Brown, & K. M. Michels. 1991. Statistical principles in experimental design. New York: McGraw-Hill.

Appendices

Appendix 1
Proportions of area under the normal curve

Z        Area between     Area
         mean and Z       beyond Z
0.00     .0000            .5000
0.01     .0040            .4960
0.02     .0080            .4920
0.03     .0120            .4880
0.04     .0160            .4840
0.05     .0199            .4801
0.06     .0239            .4761
0.07     .0279            .4721
0.08     .0319            .4681
0.09     .0359            .4641

0.10     .0398            .4602
0.11     .0438            .4562
0.12     .0478            .4522
0.13     .0517            .4483
0.14     .0557            .4443
0.15     .0596            .4404
0.16     .0636            .4364
0.17     .0675            .4325
0.18     .0714            .4286
0.19     .0753            .4247

0.20     .0793            .4207
0.21     .0832            .4168
0.22     .0871            .4129
0.23     .0910            .4090
0.24     .0948            .4052
0.25     .0987            .4013
0.26     .1026            .3974
0.27     .1064            .3936
0.28     .1103            .3897
0.29     .1141            .3859

0.30     .1179            .3821
0.31     .1217            .3783
0.32     .1255            .3745
0.33     .1293            .3707
0.34     .1331            .3669
0.35     .1368            .3632
0.36     .1406            .3594
0.37     .1443            .3557
0.38     .1480            .3520
0.39     .1517            .3483

0.40     .1554            .3446
0.41     .1591            .3409
0.42     .1628            .3372
0.43     .1664            .3336
0.44     .1700            .3300
0.45     .1736            .3264
0.46     .1772            .3228
0.47     .1808            .3192
0.48     .1844            .3156
0.49     .1879            .3121

0.50     .1915            .3085
0.51     .1950            .3050
0.52     .1985            .3015
0.53     .2019            .2981
0.54     .2054            .2946
0.55     .2088            .2912
0.56     .2123            .2877
0.57     .2157            .2843
0.58     .2190            .2810
0.59     .2224            .2776

0.60     .2257            .2743
0.61     .2291            .2709
0.62     .2324            .2676
0.63     .2357            .2643
0.64     .2389            .2611
0.65     .2422            .2578
0.66     .2454            .2546
0.67     .2486            .2514
0.68     .2517            .2483
0.69     .2549            .2451

0.70     .2580            .2420
0.71     .2611            .2389
0.72     .2642            .2358
0.73     .2673            .2327
0.74     .2704            .2296
0.75     .2734            .2266
0.76     .2764            .2236
0.77     .2794            .2206
0.78     .2823            .2177
0.79     .2852            .2148

0.80     .2881            .2119
0.81     .2910            .2090
0.82     .2939            .2061
0.83     .2967            .2033
0.84     .2995            .2005
0.85     .3023            .1977
0.86     .3051            .1949
0.87     .3078            .1922
0.88     .3106            .1894
0.89     .3133            .1867

0.90     .3159            .1841
0.91     .3186            .1814
0.92     .3212            .1788
0.93     .3238            .1762
0.94     .3264            .1736
0.95     .3289            .1711
0.96     .3315            .1685
0.97     .3340            .1660
0.98     .3365            .1635
0.99     .3389            .1611

1.00     .3413            .1587
1.01     .3438            .1562
1.02     .3461            .1539
1.03     .3485            .1515
1.04     .3508            .1492
1.05     .3531            .1469
1.06     .3554            .1446
1.07     .3577            .1423
1.08     .3599            .1401
1.09     .3621            .1379

1.10     .3643            .1357
1.11     .3665            .1335
1.12     .3686            .1314
1.13     .3708            .1292
1.14     .3729            .1271
1.15     .3749            .1251
1.16     .3770            .1230
1.17     .3790            .1210
1.18     .3810            .1190
1.19     .3830            .1170

1.20     .3849            .1151
1.21     .3869            .1131
1.22     .3888            .1112
1.23     .3907            .1093
1.24     .3925            .1075
1.25     .3944            .1056
1.26     .3962            .1038
1.27     .3980            .1020
1.28     .3997            .1003
1.29     .4015            .0985

1.30     .4032            .0968
1.31     .4049            .0951
1.32     .4066            .0934
1.33     .4082            .0918
1.34     .4099            .0901
1.35     .4115            .0885
1.36     .4131            .0869
1.37     .4147            .0853
1.38     .4162            .0838
1.39     .4177            .0823

1.40     .4192            .0808
1.41     .4207            .0793
1.42     .4222            .0778
1.43     .4236            .0764
1.44     .4251            .0749
1.45     .4265            .0735
1.46     .4279            .0721
1.47     .4292            .0708
1.48     .4306            .0694
1.49     .4319            .0681

1.50     .4332            .0668
1.51     .4345            .0655
1.52     .4357            .0643
1.53     .4370            .0630
1.54     .4382            .0618
1.55     .4394            .0606
1.56     .4406            .0594
1.57     .4418            .0582
1.58     .4429            .0571
1.59     .4441            .0559

1.60     .4452            .0548
1.61     .4463            .0537
1.62     .4474            .0526
1.63     .4484            .0516
1.64     .4495            .0505
1.65     .4505            .0495
1.66     .4515            .0485
1.67     .4525            .0475
1.68     .4535            .0465
1.69     .4545            .0455

1.70     .4554            .0446
1.71     .4564            .0436
1.72     .4573            .0427
1.73     .4582            .0418
1.74     .4591            .0409
1.75     .4599            .0401
1.76     .4608            .0392
1.77     .4616            .0384
1.78     .4625            .0375
1.79     .4633            .0367

1.80     .4641            .0359
1.81     .4649            .0351
1.82     .4656            .0344
1.83     .4664            .0336
1.84     .4671            .0329
1.85     .4678            .0322
1.86     .4686            .0314
1.87     .4693            .0307
1.88     .4699            .0301
1.89     .4706            .0294

1.90     .4713            .0287
1.91     .4719            .0281
1.92     .4726            .0274
1.93     .4732            .0268
1.94     .4738            .0262
1.95     .4744            .0256
1.96     .4750            .0250
1.97     .4756            .0244
1.98     .4761            .0239
1.99     .4767            .0233

2.00     .4772            .0228
2.01     .4778            .0222
2.02     .4783            .0217
2.03     .4788            .0212
2.04     .4793            .0207
2.05     .4798            .0202
2.06     .4803            .0197
2.07     .4808            .0192
2.08     .4812            .0188
2.09     .4817            .0183

2.10     .4821            .0179
2.11     .4826            .0174
2.12     .4830            .0170
2.13     .4834            .0166
2.14     .4838            .0162
2.15     .4842            .0158
2.16     .4846            .0154
2.17     .4850            .0150
2.18     .4854            .0146
2.19     .4857            .0143

2.20     .4861            .0139
2.21     .4864            .0136
2.22     .4868            .0132
2.23     .4871            .0129
2.24     .4875            .0125
2.25     .4878            .0122
2.26     .4881            .0119
2.27     .4884            .0116
2.28     .4887            .0113
2.29     .4890            .0110

2.30     .4893            .0107
2.31     .4896            .0104
2.32     .4898            .0102
2.33     .4901            .0099
2.34     .4904            .0096
2.35     .4906            .0094
2.36     .4909            .0091
2.37     .4911            .0089
2.38     .4913            .0087
2.39     .4916            .0084

2.40     .4918            .0082
2.41     .4920            .0080
2.42     .4922            .0078
2.43     .4925            .0075
2.44     .4927            .0073
2.45     .4929            .0071
2.46     .4931            .0069
2.47     .4932            .0068
2.48     .4934            .0066
2.49     .4936            .0064

2.50     .4938            .0062
2.51     .4940            .0060
2.52     .4941            .0059
2.53     .4943            .0057
2.54     .4945            .0055
2.55     .4946            .0054
2.56     .4948            .0052
2.57     .4949            .0051
2.58     .4951            .0049
2.59     .4952            .0048

2.60     .4953            .0047
2.61     .4955            .0045
2.62     .4956            .0044
2.63     .4957            .0043
2.64     .4959            .0041
2.65     .4960            .0040
2.66     .4961            .0039
2.67     .4962            .0038
2.68     .4963            .0037
2.69     .4964            .0036

2.70     .4965            .0035
2.71     .4966            .0034
2.72     .4967            .0033
2.73     .4968            .0032
2.74     .4969            .0031
2.75     .4970            .0030
2.76     .4971            .0029
2.77     .4972            .0028
2.78     .4973            .0027
2.79     .4974            .0026

2.80     .4974            .0026
2.81     .4975            .0025
2.82     .4976            .0024
2.83     .4977            .0023
2.84     .4977            .0023
2.85     .4978            .0022
2.86     .4979            .0021
2.87     .4979            .0021
2.88     .4980            .0020
2.89     .4981            .0019

2.90     .4981            .0019
2.91     .4982            .0018
2.92     .4982            .0018
2.93     .4983            .0017
2.94     .4984            .0016
2.95     .4984            .0016
2.96     .4985            .0015
2.97     .4985            .0015
2.98     .4986            .0014
2.99     .4986            .0014

3.00     .4987            .0013
3.01     .4987            .0013
3.02     .4987            .0013
3.03     .4988            .0012
3.04     .4988            .0012
3.05     .4989            .0011
3.06     .4989            .0011
3.07     .4989            .0011
3.08     .4990            .0010
3.09     .4990            .0010

3.10     .4990            .0010
3.11     .4991            .0009
3.12     .4991            .0009
3.13     .4991            .0009
3.14     .4992            .0008
3.15     .4992            .0008
3.16     .4992            .0008
3.17     .4992            .0008
3.18     .4993            .0007
3.19     .4993            .0007

3.20     .4993            .0007
3.21     .4993            .0007
3.22     .4994            .0006
3.23     .4994            .0006
3.24     .4994            .0006
3.25     .4994            .0006
3.30     .4995            .0005
3.35     .4996            .0004
3.40     .4997            .0003
3.45     .4997            .0003

3.50     .4998            .0002
3.60     .4998            .0002
3.70     .4999            .0001
3.80     .4999            .0001
3.90     .49995           .00005
4.00     .49997           .00003

Taken from Farhady, H. 1995. Research methods in applied linguistics. Tehran: Payame Noor University Press.
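
The two proportions tabulated in Appendix 1 can also be computed directly rather than looked up. The following is a minimal sketch in Python, assuming a standard normal curve (mean 0, standard deviation 1); the function names are illustrative and are not taken from the source of the table. It uses the error function from Python's standard library, and relies on the fact that the two areas for any Z add up to .5000, the half of the curve on one side of the mean.

```python
import math


def area_between_mean_and_z(z: float) -> float:
    """Proportion of the normal curve lying between the mean and z."""
    # For a standard normal curve this area equals 0.5 * erf(|z| / sqrt(2)).
    return 0.5 * math.erf(abs(z) / math.sqrt(2))


def area_beyond_z(z: float) -> float:
    """Proportion of the normal curve lying beyond z (the tail)."""
    # Each half of the curve has a total area of 0.5.
    return 0.5 - area_between_mean_and_z(z)


if __name__ == "__main__":
    # Print a few rows in the same layout as the appendix.
    print("Z     between  beyond")
    for z in (0.27, 1.00, 1.96, 2.58):
        print(f"{z:4.2f}  {area_between_mean_and_z(z):.4f}   {area_beyond_z(z):.4f}")
```

Rounding the results to four decimal places makes them directly comparable with the printed table, which is convenient for checking an entry that looks doubtful.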
Appendix 2
Critical values of t

                      Level of significance for one-tailed test
Degree of     .10       .05       .025      .01       .005      .0005
freedom               Level of significance for two-tailed test
              .20       .10       .05       .02       .01       .001
1            3.078     6.314    12.706    31.821    63.657   636.619
2            1.886     2.920     4.303     6.965     9.925    31.598
3            1.638     2.353     3.182     4.541     5.841    12.941
4            1.533     2.132     2.776     3.747     4.604     8.610
5            1.476     2.015     2.571     3.365     4.032     6.859
6            1.440     1.943     2.447     3.143     3.707     5.959
7            1.415     1.895     2.365     2.998     3.499     5.405
8            1.397     1.860     2.306     2.896     3.355     5.041
9            1.383     1.833     2.262     2.821     3.250     4.781
10           1.372     1.812     2.228     2.764     3.169     4.587

11           1.363     1.796     2.201     2.718     3.106     4.437
12           1.356     1.782     2.179     2.681     3.055     4.318
13           1.350     1.771     2.160     2.650     3.012     4.221
14           1.345     1.761     2.145     2.624     2.977     4.140
15           1.341     1.753     2.131     2.602     2.947     4.073
16           1.337     1.746     2.120     2.583     2.921     4.015
17           1.333     1.740     2.110     2.567     2.898     3.965
18           1.330     1.734     2.101     2.552     2.878     3.922
19           1.328     1.729     2.093     2.539     2.861     3.883
20           1.325     1.725     2.086     2.528     2.845     3.850

21           1.323     1.721     2.080     2.518     2.831     3.819
22           1.321     1.717     2.074     2.508     2.819     3.792
23           1.319     1.714     2.069     2.500     2.807     3.767
24           1.318     1.711     2.064     2.492     2.797     3.745
25           1.316     1.708     2.060     2.485     2.787     3.725
26           1.315     1.706     2.056     2.479     2.779     3.707
27           1.314     1.703     2.052     2.473     2.771     3.690
28           1.313     1.701     2.048     2.467     2.763     3.674
29           1.311     1.699     2.045     2.462     2.756     3.659
30           1.310     1.697     2.042     2.457     2.750     3.646

40           1.303     1.684     2.021     2.423     2.704     3.551
60           1.296     1.671     2.000     2.390     2.660     3.460
120          1.289     1.658     1.980     2.358     2.617     3.373
+120         1.282     1.645     1.960     2.326     2.576     3.291

Taken from Hatch, E. & H. Farhady. 1982. Research design and statistics for applied linguistics. Rowley, Mass.: Newbury House.
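
A table of critical values is used by comparing an observed value of t with the critical value for the relevant degrees of freedom. The sketch below, in Python, illustrates that comparison for the two-tailed .05 level only, with the critical values copied from that column of Appendix 2. The function names are hypothetical, and two choices in it are assumptions rather than rules stated in the appendix: when the exact degrees of freedom are not listed (for example, 35), the nearest smaller listed value is used, which gives a slightly stricter test; and an observed t that equals the critical value is counted as significant.

```python
# Two-tailed .05 critical values of t for df = 1 to 30 (Appendix 2).
_T_05_DF_1_TO_30 = [
    12.706, 4.303, 3.182, 2.776, 2.571, 2.447, 2.365, 2.306, 2.262, 2.228,
    2.201, 2.179, 2.160, 2.145, 2.131, 2.120, 2.110, 2.101, 2.093, 2.086,
    2.080, 2.074, 2.069, 2.064, 2.060, 2.056, 2.052, 2.048, 2.045, 2.042,
]

# Map each listed df to its critical value, then add the rows for 40, 60, 120.
CRITICAL_T_05 = {df: t for df, t in enumerate(_T_05_DF_1_TO_30, start=1)}
CRITICAL_T_05.update({40: 2.021, 60: 2.000, 120: 1.980})

# Value from the "+120" row, used here for any df above 120.
CRITICAL_T_05_ABOVE_120 = 1.960


def critical_t(df: int) -> float:
    """Return the two-tailed .05 critical value of t for the given df."""
    if df < 1:
        raise ValueError("degrees of freedom must be at least 1")
    if df > 120:
        return CRITICAL_T_05_ABOVE_120
    # Assumption: fall back on the nearest smaller df listed in the table.
    nearest_listed = max(listed for listed in CRITICAL_T_05 if listed <= df)
    return CRITICAL_T_05[nearest_listed]


def is_significant(observed_t: float, df: int) -> bool:
    """True if the observed t reaches the critical value (two-tailed, .05)."""
    # The sign of t is ignored because the test is two-tailed.
    return abs(observed_t) >= critical_t(df)


if __name__ == "__main__":
    print(critical_t(20), is_significant(2.5, 20))
```

The other columns of the appendix could be handled in the same way by adding one list of values per significance level.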