922 - Statistics exercise – Shira Farby

The exercise below has two parts.
Part I (sections B & C) should be submitted by 19/12/11.
Part II (sections D-G) should be submitted by 9/1/12 (after a short intro to Excel on 26/12).
Late submissions will NOT be accepted.

A. Experiment description

A linguistic experiment tested 58 children who are Russian-Hebrew bilinguals (they speak Russian at home and Hebrew at preschool). A background information form recorded each child's age in months and how long the child has been exposed to Hebrew. Two further questions on the form asked to what extent the child's mother uses Russian (more/less) and to what extent the child's mother uses Hebrew (more/less). As these are separate questions, both can be answered "more".

One part of the experiment was on social identity: the children were asked to rate (on a scale of 1 to 10) to what degree they agree with the following social labels: Russian, Israeli, Ivri, Jewish.

The second part of the experiment was linguistic: the children were asked to repeat sentences in Hebrew and Russian. The scores of two linguistic indicators were computed separately from the repeated sentences (repeating prepositions correctly, repeating plural inflection correctly).

Part I – for 19/12/11

B. Design

1. Name 3 independent variables. What is the type of their scale? (exhaustive list)

Variable                          Scale
Age                               Ratio
Length of exposure to Hebrew      Ratio
Mother in terms of Russian use    Nominal
Mother in terms of Hebrew use     Nominal

2. Name 3 dependent variables. What is the type of their scale? (exhaustive list)

Variable                                            Scale
How much do you agree: Russian                      Ordinal
How much do you agree: Israeli                      Ordinal
How much do you agree: Ivri                         Ordinal
How much do you agree: Jewish                       Ordinal
Repeating prepositions correctly - Hebrew           Ratio
Repeating plural inflection correctly - Hebrew      Ratio
Repeating prepositions correctly - Russian          Ratio
Repeating plural inflection correctly - Russian     Ratio

3.
How many conditions are in the social part? Give them names.

Four conditions: Russian identity, Israeli identity, Ivri identity and Jewish identity.

4. How many conditions are in the linguistic part? Give them names.

There are four conditions, arranged in a 2x2 design. The two factors are (1) the language tested (Russian/Hebrew) and (2) the linguistic category repeated (preposition/plural inflection). The four conditions are:
Preposition repetition in Hebrew
Preposition repetition in Russian
Plural inflection repetition in Hebrew
Plural inflection repetition in Russian

C. Hypothesis

1. Choose one part of the experiment (social/linguistic), and phrase for it an experimental hypothesis. Remember that the hypothesis predicts an effect of independent variable(s) on dependent variable(s).

There are many possible experimental hypotheses; only two are given here as examples. Please use your own hypothesis for part II.

Social: The more the mother uses Russian, the lower the child will rate the social label “Israeli” relative to “Russian”.
Linguistic: The longer a child has been exposed to Hebrew, the higher s/he will score on the Hebrew tasks (correct repetition of prepositions/plural inflection).

2. What is the null hypothesis for your hypothesis?

Social: The mother’s use of Russian (more/less) does not affect the child’s rating of the social labels “Israeli” and “Russian”.
Linguistic: The length of exposure to Hebrew does not affect the score on the Hebrew tasks (correct repetition of prepositions/plural inflection).

3. What is the direction of your hypothesis?

Social: The direction of this hypothesis is negative. In order to reject the null hypothesis, the results for the social label “Israeli” will have to be at the lowest 5% of the distribution of children whose mother uses more Russian.
Linguistic: The direction of this hypothesis is positive.
In order to reject the null hypothesis, the results of the Hebrew tasks (correct repetition of prepositions/plural inflection) will have to be at the highest 5% of the distribution of children with long exposure to Hebrew.

Part II – for 9/1/12

D. General computations

1. For the variable "age" compute the mean and median.

The mean of age is 69.67 months. The median of age is 70 months.

2. For the variable "Length of exposure to Hebrew" compute the mean and standard deviation.

The mean is 3.75 years. The standard deviation is 1.37 years.

3. For the variable "Mother in terms of Russian use" compute the percent of "more" responses.

The percent of “more” responses is 84.48%.

4. For the variable "Mother in terms of Hebrew use" compute the ratio of "more" responses.

The ratio between “more” and “less” responses is 39/19.

5. For the variable "How much do you agree: Ivri" compute minimum, maximum and range.

The minimum is 1. The maximum is 10. The range is 1 to 10 (that is, 10 - 1 = 9).

6. For the variable "How much do you agree: Israeli" construct a frequency table.

Rating       1    2    3    4    5    6    7    8    9    10
Frequency    15   4    1    3    3    2    1    1    0    28

7. For the variable "preposition repetition, Hebrew" compute the proportion of correct repetition (for each child), and add this as a new column to the Excel table.

The proportion is written as a real number (28/30 and so on).

Thinking question: what is the difference between the new column (proportion of prep repetition) and the old grade (prep repetition)?
Answer: instead of a raw count of N correct responses, we have a grade, which can be compared with other grades.

8. For the variable "preposition repetition, Russian" construct a column with percent of correct responses for each child, and fill in the following table:

Preposition (Russian)

% of correct    Frequency    Cumulative frequency    Percentile
40-70           6            6                       10.34%
71-80           7            13                      22.41%
81-90           6            19                      32.76%
91-95           17           36                      62.07%
96-100          22           58                      100.00%
Total N         58

Optional: generate a frequency graph for this variable.

E. Descriptive statistics

1.
Social hypothesis: In a 2x2 design of 4 conditions, the effect of the mother's use of Russian (more/less) as the independent variable on the social identity (Israeli/Russian) as the dependent variable(s) was measured in a rating task, on a scale of 1-10. (N=58)

Linguistic hypothesis: In a 2x2 design of 4 conditions, the effect of the length of the child's exposure to Hebrew (more/less) as the independent variable on the correct repetition of Hebrew functional categories (preposition/plural inflection) as the dependent variable(s) was measured in a sentence repetition task. (N=58)

2. Construct the appropriate table to report mean and standard deviation of all conditions.

Social label    Mother's use of Russian    Mean    Std
Israeli         More                       6.31    3.97
Israeli         Less                       5.67    4.39
Russian         More                       6.73    3.79
Russian         Less                       5.0     4.52

For the linguistic task, the variable "length of exposure" was used to divide the children into two groups, based on its mean. The groups are: "less exposed" (range: 0.5-3.17, N=18) and "more exposed" (range: 3.25-5.75, N=40). In addition, the linguistic grades for the repetition tasks were computed as proportions of correct responses.

Repetition, Hebrew    Length of exposure    Mean    Std
Preposition           More                  0.89    0.12
Preposition           Less                  0.89    0.086
Plural inflection     More                  0.86    0.12
Plural inflection     Less                  0.88    0.086

3. Optional: generate a histogram of the findings.

F. Initial Conclusions

1. Do the findings of this group suggest that your hypothesis is in the right direction?

Social: Looking at the "more" group, the mean for "Israeli" (6.31) is slightly lower than the mean for "Russian" (6.73). Thus, the raw means support the hypothesis.

Linguistic: For both categories (preposition/plural inflection) the "more" group was expected to have higher means than the "less" group. This is not supported. For preposition repetition, both groups perform equally well (0.89); for plural inflection, the "more" group actually scored slightly lower (0.86) than the "less" group (0.88).

2. Do the findings support the effect of the independent variable(s)?
Social: The choice of "Mother's use of Russian" as the independent variable shows an effect on social identity; for both labels (Israeli/Russian) the "less" group has lower means than the "more" group (5.33 vs. 6.52 when averaged over the two labels).

Linguistic: The choice of the mean of "length of exposure" (3.17) for dividing the children into two groups (more/less) does not seem to have an effect on the performance in these tasks.

G. Inferential statistics

1. Name the statistical test you will use to see if the effect is significant.

Social: There is one independent variable (one-way test), but two measures from each child (repeated). The grades are ordinal; hence, a non-parametric test is required: Friedman’s ANOVA (which is parallel to a one-way ANOVA).

Linguistic: There is one independent variable (one-way test), but two measures from each child (repeated). The grades are computed as percents, and hence are on a ratio scale. The test is a one-way repeated ANOVA.

2. How many degrees of freedom do you have?

In general, for a t-test we will have df = N-1 = 57. For ANOVA we have an F score, which requires us to compute df within groups as well as df between groups. For both hypotheses, df between = k-1 = 1, and df within = N-2 = 56. (Since we have two groups, we lose a degree of freedom for each group.)

3. How will you compute the critical value for rejecting the null hypothesis? (α or α/2)

Since both hypotheses have a direction, the critical value will be computed using α.
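
The cumulative frequency and percentile columns of the table in D.8 follow directly from the bin frequencies, so they can be checked outside Excel. The following is a minimal Python sketch of that arithmetic. The function and variable names are illustrative assumptions, not part of the exercise, and the bin frequencies are copied from the D.8 table rather than recomputed from the children's raw scores.

```python
from itertools import accumulate

# Bin labels and frequencies copied from the D.8 table
# (preposition repetition, Russian); not derived from raw data here.
bins = ["40-70", "71-80", "81-90", "91-95", "96-100"]
frequencies = [6, 7, 6, 17, 22]


def cumulative_table(freqs):
    """Return (cumulative frequencies, cumulative percentages to 2 decimals)."""
    total = sum(freqs)
    cumulative = list(accumulate(freqs))  # running sum over the bins
    percentiles = [round(100 * c / total, 2) for c in cumulative]
    return cumulative, percentiles


cumulative, percentiles = cumulative_table(frequencies)

# Print the rows in the same column order as the table in D.8.
for label, f, c, p in zip(bins, frequencies, cumulative, percentiles):
    print(f"{label:<8}{f:>4}{c:>5}{p:>9.2f}%")
print(f"Total N {sum(frequencies)}")
```

Each percentile is the cumulative frequency divided by the total N of 58, so the last bin always reaches 100%.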