Teacher’s Manual
Developing Young Researchers:
A Course on Teaching the Fundamentals
Of Research to Upper Primary – Secondary Students
TEACHER'S MANUAL
Part III: Data Analysis
Lesson 11: Analyzing Quantitative Data
Developed by the Student Research Committee
Dr. Mary Kellett (Open University)
Dr. Gene Jongsma (Education Institute)
Ms. Amina Amir Hamza (Al Bayan)
Dr. Hala Fathy (Al-Ieman)
Ms. Maria Prasad (Qatar Preparatory)
Ms. Mona Al-Boanain (Umm Al-Qura)
Ms. Sumaia Kassab (Al-Ieman)
Ms. Tarfa Nasser al-Naimi (Education Institute)
Lesson 11: Analyzing Quantitative Data
Page 1 of 25
Teacher’s Manual
Lesson 11
Analyzing Quantitative Data
Lesson Overview
Duration: 1 Block + Optional activities on Data Displays
Lesson Objectives:
At the end of this lesson, students should be able to:



Distinguish between qualitative and quantitative data
Define quantitative data
Develop strategies for handling large quantities of raw data
Core Knowledge:







Types of data
o Qualitative data (descriptive)
o Quantitative data (discrete, continuous)
Measures of central tendency
o Mean, median, mode
Measures of variability
o Variance, standard deviation
Analyzing comparative data
Statistical significance
Correlation
o Direction of the correlation (positive, negative)
o Strength of the correlation (low to high)
Tabulating and representing data (Optional)
o Frequency tables
o Scatter diagrams
o Line graphs, bar graphs, circle graphs
Skills: critical reading, critical thinking, analyzing.
Teaching Strategies: questioning, modelling, discussing, group work.
Follow-up Activities: Activity Sheets #4-8 give follow-up practice in making and
analyzing charts and graphs.
Curriculum Links: Maths – data representations, ICT, science
Key Terms
qualitative data
quantitative data
statistical significance
median
standard deviation
discrete data
mean
correlation
continuous data
mode
variance
Warm-up Activity
Activity Sheet #1
Classifying


Give the students a large number of wooden cubes of different colours and sizes
and have them sort the cubes, working in four groups.
1- The first group sorts the cubes by colour.
2- The second group sorts the cubes by size.
3- The third group counts the number of cubes of each colour.
4- The fourth group counts the number of cubes of each size.
Ask students to record the data on Activity Sheet #1.
Core Knowledge
Types of Data

Data can be qualitative or quantitative.
Qualitative data are descriptive:
1. Qualitative data do not use numbers. They are completely descriptive.
2. Qualitative data are often subjective (depending on people's opinions), so
they may be less objective than quantitative data.
Examples: eye colours, hair colours
Quantitative data measure quantities:
1. Quantitative data are anything that you can measure in numbers.
2. Quantitative data tend to be easier to analyze than qualitative data.
Examples: heights, weights, test scores, time to complete homework
Activity Sheet #2
Quantitative or Qualitative


Use Activity Sheet #2 to reinforce the distinction between qualitative and
quantitative data.
Suggested answers are provided for the teacher. Please understand that the
suggested answers for the open-ended questions are suggestions. Students may
think of other valid answers that are different.
Types of Quantitative Data

Quantitative data can be discrete or continuous

Discrete data can be measured exactly.

Data are discrete if they represent something countable
Examples:
Number of people going to a movie at the cinema
The scores in a football match

Continuous data cannot be measured exactly.

Continuous data can be measured on a continuum or scale.

Continuous data can have almost any numeric value and can be meaningfully
subdivided into finer and finer increments, depending upon the precision of
the measurement system.

In mathematics, a continuous quantity can take any of an infinite number of
values between whole numbers, so it may not be possible to measure it
completely accurately.
Examples:
height, weight, temperature, the amount of sugar in an orange,
time required to run a mile, an infant's birth weight in grams
See Activity Sheet #3
Discrete or Continuous


Use Activity Sheet #3 to practice applying the distinction between discrete and
continuous data.
Suggested answers are provided for the teacher. Please understand that the
suggested answers for the open-ended questions are suggestions. Students may
think of other valid answers that are different.
Measures of “Central Tendency” or Averages

One of the most common types of quantitative data analysis is to calculate the
average. The average is a single number that represents the performance or
attributes of a group.

The three most common measures of central tendency are the mean, median,
and mode.
1. The mean is the arithmetic average.

To find the mean, add together all data values and divide by the total
number of values in the sample.
Example: Number of people living in each of nine three-bedroom houses:
3 5 1 3 7 5 5 5 2
Mean = (3+5+1+3+7+5+5+5+2) / 9 = 36 / 9 = 4

The mean changes if you add or remove a data value from the sample
(unless it's equal to the mean itself).

2. The median is the middle value.

To find the median, put the data in ascending order, and then find the
middle value.

It's easy to find if you have an odd number of values.

If there is an even number of values, the median is halfway between
the two middle values.
Example: What is the median for 5, 8, 10, 12?
The median comes halfway between 8 and 10.
So the median = 9.

3. The mode is the value that occurs most often.
Example: What is the mode for the following data set?
3 5 1 3 7 5 5 5 5 2
(The mode is 5.)
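Teachers who want to check answers on a computer can compute the three averages with Python's standard `statistics` module. The following is a minimal sketch that uses the example data sets from this section; the variable names are illustrative and are not part of the lesson materials.

```python
import statistics

# Number of people living in each of nine three-bedroom houses (mean example)
house_sizes = [3, 5, 1, 3, 7, 5, 5, 5, 2]
mean_value = statistics.mean(house_sizes)  # sum of values divided by 9

# Even number of values: the median is halfway between the two middle values
median_value = statistics.median([5, 8, 10, 12])

# Data set from the mode example: the value that occurs most often
mode_value = statistics.mode([3, 5, 1, 3, 7, 5, 5, 5, 5, 2])

print("mean:", mean_value)
print("median:", median_value)
print("mode:", mode_value)
```

The results should agree with the worked examples in the text (mean 4, median 9, mode 5).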
Measures of Variability
Another important type of quantitative analysis is to look at how measurements
"spread out." For example, if a research group took a test, did everyone get scores
that are bunched up near the middle or average of the group, or did some people get
really high or really low scores, making the distribution more spread out? We call
this "spreading out" of scores variability.
There are two common measures of variability – variance and standard deviation.
There are formulas for calculating these statistics. Computer programs can be used to
do the calculations quickly and easily.
Standard deviation can be a useful statistic because it is related to the normal curve. If
the scores follow a normal curve and we know the mean and the standard deviation of
the group, we can compute the percentile score of everyone in the group. An
illustration of a normal distribution is below.
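As a rough illustration of what such a computer program does, the sketch below uses Python's standard `statistics` module on the nine house sizes from the mean example. Two assumptions are made that the manual does not state: the "population" formulas are used (dividing by the number of values; the "sample" versions `statistics.variance` and `statistics.stdev` divide by one less), and the percentile step treats the data as if they followed a normal curve, which such a small data set only approximates.

```python
import statistics
from statistics import NormalDist

scores = [3, 5, 1, 3, 7, 5, 5, 5, 2]  # data reused from the mean example

mean_value = statistics.mean(scores)

# Variance: the average squared distance of each value from the mean
variance = statistics.pvariance(scores)

# Standard deviation: the square root of the variance
std_dev = statistics.pstdev(scores)

# ASSUMPTION: the values follow a normal curve with this mean and spread.
curve = NormalDist(mu=mean_value, sigma=std_dev)

# Percentage of the curve lying at or below a value of 7
percentile_of_7 = curve.cdf(7) * 100

print("variance:", round(variance, 2))
print("standard deviation:", round(std_dev, 2))
print("approximate percentile of a score of 7:", round(percentile_of_7))
```

A value equal to the mean always sits at the 50th percentile of a normal curve, which is a quick way to sanity-check the calculation.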
Analyzing Comparative Data
Researchers often want to compare measurements taken on one group with
measurements taken on another group. For example, suppose we were testing a new
fertilizer. One group of plants (experimental plants) received the fertilizer; the second
group (control plants) did not. After two weeks, the height of all of the plants was
measured. The mean (average) height of the experimental plants was 65 centimetres
and the mean (average) height of the control plants was 60 centimetres.
Since 65 is greater than 60, does that mean the new fertilizer really works? Not
necessarily. The difference of 5 centimetres may have been due to chance. So, to
see whether the difference is really due to the fertilizer, the researcher must carry
out a mathematical procedure to test for statistical significance.
Statistical Significance
Mathematicians have devised a procedure to decide whether a difference is "real." The
procedure is called a test of statistical significance and it is based on the laws of
probability. The test checks whether the difference between two numbers is bigger
than the differences that might be expected to occur by chance. We are not
going to learn the mathematical formula in this course. There are computer programs
that can calculate this quickly for us. We just want to understand the concept.
Generally speaking, a difference is called statistically significant if the probability of
getting a difference that large by chance alone is no greater than 5 in a hundred. We
write the cut-off this way: p = 0.05. When you conduct a test of statistical
significance, the p value for your data is compared to the cut-off value in statistical
inference tables. If your p value is smaller than the cut-off (p < .05), then the
difference between your two measurements is probably not due to chance. But if your
p value is greater than the cut-off (p > .05), then the difference could easily be due
to chance, and you cannot claim that it is real.
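The manual does not name a particular test, so the sketch below swaps in one simple option, an exact permutation test, because it shows the idea of "differences expected by chance" directly. It uses only the Python standard library. The plant heights are hypothetical values invented for illustration (chosen so the group means are 65 and 60 centimetres, as in the fertilizer example); they are not data from the manual.

```python
from itertools import combinations

# HYPOTHETICAL plant heights in centimetres (invented for illustration)
experimental = [63, 66, 64, 67, 65]  # received fertilizer, mean 65
control = [58, 61, 60, 62, 59]       # no fertilizer, mean 60


def mean(values):
    return sum(values) / len(values)


observed = mean(experimental) - mean(control)

# If the fertilizer made no difference, any 5 of the 10 plants could have
# ended up in the "experimental" group. Try every possible split and count
# how often chance alone gives a difference at least as large as observed.
all_heights = experimental + control
group_size = len(experimental)
rest_size = len(all_heights) - group_size
total = sum(all_heights)

splits = 0
as_extreme = 0
for group in combinations(range(len(all_heights)), group_size):
    group_sum = sum(all_heights[i] for i in group)
    diff = group_sum / group_size - (total - group_sum) / rest_size
    splits += 1
    if abs(diff) >= abs(observed) - 1e-9:  # small tolerance for rounding
        as_extreme += 1

p_value = as_extreme / splits
print(f"observed difference = {observed} cm, p = {p_value:.4f}")
print("significant at p < .05" if p_value < 0.05 else "could be due to chance")
```

With these invented numbers only 2 of the 252 possible splits give a difference as large as 5 centimetres, so the p value falls well below .05. Real experiments with overlapping groups will usually give larger p values.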
Correlation
Correlation is a statistical technique which shows how strongly two things are
related. For example, height and weight are related. Taller people tend to be heavier
than shorter people. The relationship isn't perfect. People of the same height vary in
weight, and you can easily think of two people you know where the shorter one is
heavier than the taller one. Nonetheless, the average weight of people 1.5 metres tall
is less than the average weight of people 1.6 metres tall, and their average weight is
less than that of people 1.7 metres tall, etc. Correlation can tell you just how much
of the variation in people's weights is related to their heights.
Direction of the Correlation

There are two directions of correlation. In other words, there are two
patterns that correlations can follow. These are called positive correlation
and negative correlation.
Positive correlation
In a positive correlation, as the values of one of the variables increase, the
values of the second variable also increase. Likewise, as the value of one
of the variables decreases, the value of the other variable also decreases.
Example: the correlation between weight and height.
Negative correlation
In a negative correlation, as the values of one of the variables increase, the values
of the second variable decrease. Likewise, as the value of one of the variables
decreases, the value of the other variable increases.
This is still a correlation. It is like an “inverse” correlation. The word “negative” is
a label that shows the direction of the correlation.
There is a negative correlation between TV viewing and class grades—students
who spend more time watching TV tend to have lower grades (or phrased as
students with higher grades tend to spend less time watching TV).
Here are some other examples of negative correlations:
1. Education and years in jail—people who have more years of education tend to
have fewer years in jail (or phrased as people with more years in jail tend to have
fewer years of education)
2. Crying and being held—among babies, those who are held more tend to cry less
and babies who are held less tend to cry more.
Seeing is believing! The following Website offers a neat interactive program for
exploring correlations.
http://www.ba.infn.it/~zito/museo/esp148/cor7.html
Strength of the Correlation
Correlations, whether positive or negative, range in their strength from weak to
strong.
Positive correlations will be reported as a number between 0 and 1. A score of 0
means that there is no correlation (the weakest measure). A score of 1 is a perfect
positive correlation, which does not really happen in the “real world.” As the
correlation score gets closer to 1, it is getting stronger. So, a correlation of .8 is
stronger than .6; but .6 is stronger than .3.
Negative correlations are between 0 and -1. Again, a 0 means no correlation at all.
A score of –1 is a perfect negative correlation, which does not really happen. As
the correlation score gets close to -1, it is getting stronger. So, a correlation of -.7
is stronger than -.5; but -.5 is stronger than -.2.
Remember that the negative sign does not indicate anything about strength. It is a
symbol to tell you that the correlation is negative in direction. When judging the
strength of a correlation, just look at the number and ignore the sign.
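To make direction and strength concrete, the sketch below computes the most common correlation score, the Pearson correlation coefficient, using only the Python standard library. The manual does not say which correlation formula it has in mind, so this choice is an assumption. All four data lists are hypothetical values invented for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much the two variables move together
    together = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable moves on its own
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return together / sqrt(spread_x * spread_y)


def strength(r):
    """Strength ignores the sign: only the size of the number matters."""
    return abs(r)


# HYPOTHETICAL data, invented for illustration
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [52, 58, 69, 74, 85]
tv_hours = [1, 2, 3, 4, 5]
grades = [90, 85, 80, 70, 65]

r_height_weight = pearson_r(heights_cm, weights_kg)  # positive direction
r_tv_grades = pearson_r(tv_hours, grades)            # negative direction

print("height and weight:", round(r_height_weight, 2))
print("TV hours and grades:", round(r_tv_grades, 2))
print("-.7 is stronger than .5:", strength(-0.7) > strength(0.5))
```

The invented data are deliberately tidy, so both scores come out very close to 1 or -1; real classroom data would normally give weaker correlations.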
Advantages of Correlations
An advantage of the correlation method is that we can make predictions about things
when we know about correlations. If two variables are correlated, we can predict one
based on the other. For example, we know that SAT scores and college achievement
are positively correlated. So when college admission officials want to predict who is
likely to succeed at their schools, they will choose students with high SAT scores.
Disadvantages of Correlations
The problem that most students have with the correlation method is remembering that
correlation does not measure cause. For example, we know that education and income
are positively correlated. We do not know if one caused the other. It might be that
having more education causes a person to earn a higher income. It might be that
having a higher income allows a person to go to school more. It might also be due to
some third variable.
A correlation tells us that the two variables are related, but we cannot say anything
about whether one caused the other. This method does not allow us to come to any
conclusions about cause and effect.
Assessment
The assessment for this lesson asks students to apply the concepts learned to
hypothetical situations.
Glossary
qualitative data – data based on descriptive or non-quantitative features such as eye
colour or hair colour, nationalities of people, and religions
quantitative data – data based on quantitative features such as scores on a test,
temperature, or weight
discrete data – data that are counted or measured separately such as the number of
students in a class or the number of buses in the parking lot
continuous data – data that are measured on a continuous scale such as temperature
and weight
statistical significance – a mathematical procedure for determining if the difference
between two scores occurred by chance or not
mean – the arithmetic average; a measure of central tendency
mode – the score that appears most frequently in a group of scores: a measure of
central tendency
median – the score that is in the middle of a group of scores; a measure of central
tendency
variance – a measure of variability that shows how spread out the scores in a group are
standard deviation – a measure of variability related to the normal curve; it is the
square root of the variance
correlation – a statistical procedure that shows how two things are related
Tabulating and Presenting Data
(Optional Lesson and Activities)
Several types of statistical/data presentation tools exist, including: (a) charts
displaying frequencies (bar graphs and pie graphs); (b) charts displaying trends (line
graphs and run charts); (c) charts displaying distributions (histograms); and (d) charts
displaying associations (scatter diagrams).
Graphs and charts are good to use for a couple of reasons. They communicate a lot of
information in a small space. They also present data visually making it easier for
many people to understand. For this reason, graphs and charts are often used in
newspapers, magazines, and reports.
Sometimes, complicated information is difficult to understand and needs an
illustration. Graphs or charts can help impress people by getting your point across
quickly and visually.
Data Display Tools
To Show | Examples | Data Needed
Frequency of occurrence: simple percentages or comparisons of magnitude | Bar chart, Pie chart | Tallies by category (data can be attribute data or variable data divided into categories)
Trends over time | Line graph, Run chart | Measurements taken in chronological order (attribute or variable data can be used)
Distribution: variation not related to time (distributions) | Histograms | Forty or more measurements (not necessarily in chronological order, variable data)
Association: looking for a correlation between two things (as above examples of correlations) | Scatter diagram | Forty or more paired measurements (measures of both things of interest, variable data)
[Figure: Run Chart (according to time)]
[Figure: Line Graph of Average Daily Temperature (plotted by day)]
Tally Charts and Frequency Tables
Frequency tables let you see lots of raw data more easily. They can show the
frequency (how many) in each group.
Example
The marks below were scored by the children in a class on their maths test. All the
marks are out of ten. To organize these data, a tally chart has been produced.
[Figure: Raw Data and Tally Chart]
The tally chart has 3 columns (these can also be arranged as rows):
1. The marks, from a low of 2 to a high of 10.
2. A tally, with one tally mark made for each person getting that mark.
3. The frequency, which is the number of tally marks.
Grouped frequency table: When you have lots of data, you can group them and make
a grouped frequency table, as in the following example:
[Figure: Grouped frequency table]
Be careful with continuous data: every possible value must have a group
to go into.
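One way to make sure every possible value has a group is to let each group include its lower edge but not its upper edge, so a height of exactly 160 goes into "160 to under 170" and nowhere else. The sketch below shows this in Python. The heights, the starting value, and the group width are hypothetical choices made for illustration.

```python
# HYPOTHETICAL heights in centimetres (continuous data, invented)
heights = [150.0, 152.5, 159.9, 160.0, 164.2, 170.0, 171.3, 179.9]

lowest = 150     # lower edge of the first group
width = 10       # every group is 10 cm wide
group_count = 3  # groups: 150 to under 160, 160 to under 170, 170 to under 180

frequencies = [0] * group_count
for h in heights:
    # Whole number of group widths above the lowest edge gives the group
    index = int((h - lowest) // width)
    if not 0 <= index < group_count:
        raise ValueError(f"{h} does not fit into any group")
    frequencies[index] += 1

for i, frequency in enumerate(frequencies):
    low = lowest + i * width
    print(f"{low} to under {low + width}: {frequency}")
```

Because the groups share edges without overlapping, no value can be counted twice and no value between 150 and 180 is left out.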
Using Tally Charts and Frequency Tables to Analyze Data
You can calculate percentages for each column or row to show what percentage of
the total each subgroup makes up. This way you can compare groups that are not
similar in size.
Example
Suppose that in thirty shots at a target, a marksman makes the following scores:
5 2 2 3 4 4 3 2 0 3 0 3 2 1 5
1 3 1 5 5 2 4 0 0 4 5 4 4 5 5
The frequencies of the different scores can be summarized as follows:
Score | Tally    | Frequency | Frequency (%)
0     | ////     | 4         | 13%
1     | ///      | 3         | 10%
2     | /////    | 5         | 17%
3     | /////    | 5         | 17%
4     | ///// /  | 6         | 20%
5     | ///// // | 7         | 23%
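The same frequency table can be produced by a short program. The sketch below counts the marksman's thirty scores with `collections.Counter` from the Python standard library and rounds each percentage to the nearest whole number, which is how the percentages in the table appear to have been rounded.

```python
from collections import Counter

# The marksman's thirty scores, copied from the example as two rows of digits
shots = "522344320303215" + "131552400454455"
scores = [int(digit) for digit in shots]

counts = Counter(scores)  # how many times each score occurs
total = len(scores)

table = []
for score in sorted(counts):
    frequency = counts[score]
    percent = round(frequency * 100 / total)
    table.append((score, frequency, percent))
    print(f"{score}  {'/' * frequency:<8} {frequency:>2}  {percent}%")
```

Each printed row gives the score, a row of tally marks, the frequency, and the percentage of the thirty shots.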
Using Pie and Bar Graphs
Bar and pie charts use pictures to compare the sizes, amounts, quantities, or
proportions of various items or groupings of items.
When to Use Them
Bar and pie charts can be used in defining or choosing problems to work on,
analyzing problems, verifying causes, or judging solutions. They make it easier to
understand data because they present the data as a picture, highlighting the results.
Simple bar charts sort data into simple categories.
Grouped bar charts divide data into groups within each category and show
comparisons between individual groups as well as between categories. They give more
useful information than a simple total of all the components.
Stacked bar charts, like grouped bar charts, use grouped data within
categories. They make clear both the sum of the parts and each group’s contribution
to that total.
[Figure: Sample Bar Charts]
[Figure: Sample Pie Chart]
Making a pie chart
Take the data to be charted and calculate the percentage for each category.
First, total all the values.
Next, divide the value of each category by the total.
Then, multiply the result by 100 to create a percentage for each value.
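The steps above can be sketched in a few lines of Python. The category tallies are hypothetical values invented for illustration. The last step, turning each percentage into the angle of its sector, is an addition not described in the manual; it relies only on the fact that a full circle is 360 degrees.

```python
# HYPOTHETICAL tallies by category (invented for illustration)
favourite_colours = {"Red": 10, "Blue": 6, "Green": 4}

# First, total all the values
total = sum(favourite_colours.values())

slices = {}
for category, value in favourite_colours.items():
    # Next, divide by the total and multiply by 100 to get a percentage
    percent = value * 100 / total
    # Extra step (not in the manual): sector angle out of 360 degrees
    angle = percent * 360 / 100
    slices[category] = (percent, angle)
    print(f"{category}: {percent:.0f}% of the circle, {angle:.0f} degrees")
```

The percentages should always add up to 100 and the angles to 360, which is a useful check before drawing the chart.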
Important points to remember when using bar and pie charts:
1. Be careful not to use too many notations on the charts. Keep them as simple as
possible and include only the information necessary to interpret the chart.
2. Do not draw conclusions not justified by the data. For example, determining
whether a trend exists may require more statistical tests and probably cannot
be determined by the chart alone. Differences among groups also may require
more statistical testing to determine if they are significant.
3. Whenever possible, use bar or pie charts to support data interpretation. Do not
assume that results or points are so clear and obvious that a chart is not needed
for clarity.
4. A chart must not lie or mislead! To ensure that this does not happen, follow
these guidelines:
o Scales must be in regular intervals.
o Charts that are to be compared must have the same scale and symbols.
o Charts should be easy to read.
Activity Sheet #1
Classifying
Name:
Class:
Date:
Classify the wooden cubes:
Group 1: Sort them by colours
1-    2-    3-    4-    5-    6-
Group 2: Sort them by size
1-    2-    3-    4-    5-    6-
Group 3: Find the number of each colour
Colour | Red | Black | Yellow | Green | White
No.    |     |       |        |       |
Group 4: Find the number of each size
Size | Small | Medium | Large
No.  |       |        |
Activity Sheet #2
Quantitative or Qualitative?
Directions: Classify each type of data listed below as quantitative or qualitative.
Then list two more examples of each kind of data.
Type of Data | Quantitative | Qualitative
1- Number of students in each class. | * |
2- Time taken to finish your homework. | * |
3- The colours of the balloons held by children in the garden. | | *
4- The height of the students in your class. | * |
5- The nationalities of people in Qatar. | | *
6- The number of times each student in your class is late each week. | * |
List two more examples of quantitative data:
1. The number of cars registered in Doha
2. The average daily temperature in Abu Dhabi.
List two more examples of qualitative data:
1. The names of animals that are on the “nearly extinct” list
2. The nationalities of students attending the Independent Schools
Activity Sheet #3
Discrete or Continuous?
Directions: Classify each type of data listed below as discrete or continuous.
Then list two more examples of each kind of data.
Type of Data | Discrete | Continuous
1- The weight of a newborn killer whale | | *
2- The number of students in your English class | * |
3- The number of books you read last year | * |
4- The temperature of the water in the Gulf on the first day of each month | | *
5- The amount of rain that fell in Doha in 2007 | | *
6- The total amount (in Riyals) of sales of ice cream at a Baskin Robbins shop | | *
List two more examples of discrete data:
1. The number of countries you have visited.
2. The number of English teachers at your school.
List two more examples of continuous data:
1. The weight of the produce you buy in the market.
2. The amount of water you drank yesterday.
Lesson 11: Analyzing Quantitative Data
Assessment
Huda is doing a research study to find out if more people climb the stairs or take the
lift in an office building in Doha. She is observing for one hour each day for a week.
When a person enters the lobby of the building, she ticks whether they took the stairs
or the lift.
1. Is Huda collecting qualitative or quantitative data?
Quantitative data
2. Is her data discrete or continuous?
Discrete data
With the help of the PE teacher at his school, Ali is doing an experiment to see what
effect a daily exercise class has on boys' strength. One class goes to a 15-minute
exercise class each day. Another class does not receive any exercise. After one month,
the boys in both classes are tested on the number of push-ups they can do.
3. Make a tally chart that Ali could use to summarize his data.
Answers will vary
4. What would be a good measure of central tendency that Ali could use?
Mean or median is probably best; sample is not large enough to make the mode a
good measure
5. After he summarized his data, Ali found the average number of push-ups done by
the boys in the exercise class was 12 compared to 10 for the boys who did not
attend the exercise class.
Does this prove that the exercise class makes boys stronger?
No
What does Ali need to do to test his hypothesis?
Test the statistical significance of the difference between the means of the two groups
Activity Sheet #4
Making a Line Graph
Name:
Date:
Class:
The data below list three students' math quiz scores for nine weeks. Make a
triple-line graph from the data.
a) What trends can you notice?
b) Which quiz was seemingly easier than the others?
c) Describe Marlene's performance on the quizzes.
Activity Sheet #5
Making a Bar Graph
Name:
Date:
Class:
Sara asked the people in her class how many hours per day they watched TV.
The results are below; she already organized them in order.
First, use the data to complete the frequency table.
Second, make a bar graph.
Third, find the average for the class.
001111111111122223333444556
Work Sheet
Activity Sheet #6
Making a Bar Graph
Name:
Date:
Class:
Sara also asked the same class about their favorite color. Their responses have
already been grouped into the frequency table. Make a bar graph.
Can you find the average?
_______________
Activity Sheet #7
Analyzing Line Graphs
Name:
Date:
Class:
Answer the questions.
a. What is the coldest month in Buenos Aires?
b. Why is it not January or February?
c. What are the warmest months in Buenos Aires?
d. Does Buenos Aires get snow?
e. What are the coldest months in Irkutsk?
f. What are the warmest months?
g. Does Irkutsk get snow?
h. Is the warmest weather in Buenos Aires warmer or colder than summer in your
area? How about Irkutsk?
i. Where are Buenos Aires and Irkutsk located? Check from a map.
Activity Sheet #8
Analyzing Circle Graphs
Name:
Date:
Class:
1. Write the percentages into the right circle sectors.
2. Find 5%, 10% and 20% using mental math.
3. Discount time is always fun!
4. Match the percentages with the right circle sectors. Find how many of each
drink was sold.
Math Mammoth Statistics Worksheets Collection. Copyright SpiderSmart, Inc. and Taina Maria Miller
www.MathMammoth.com