CML 2 - Nicole De Langen

CML 2
Statistics and Probability
Nicole De Langen
CONTENTS
Data
Statistic
Continuous Data
Misrepresent
Column Graph
Bar graph
Pie Graph
Scatter plot
Histogram
Five summaries Quartile
The Box and whisker plot
Central tendency
Probability Distributions
Lattice diagram
Venn diagram
Tree diagram
Frequency table
Probability
Characteristics of shape
Test and reflections
YEAR 10 LEARNING OUTCOMES
CML 2 – Probability and statistics: I think that this CML may be easier to do, because there aren't many formulas (I
think) used in statistics and probability. It would be good to understand most of this CML because it will help in
other subjects when asked to explain graphed information, or even to work out certain data points when writing a
practical report. This time I would like to improve on the reflective side of the journal and include a better
description of the modelling task.
Data
- A collection of facts, such as values or measurements.
What is smarter, a cat or a dog?
Results
Cats: 10
Dogs: 20
The results are the data: 10 is the number of people that chose cats and 20 is the number of people that chose dogs.
The information comes from the question (cat or dog).
Statistic
- A numerical fact or datum, especially one computed from a sample. Statistics is about how to collect, summarize and present data.
Day    Height
1      6.5
2      6.6
3      6.8
4      6.7
The statistic lays out the results or data in a readable format, making them easier to understand.
Average Height = 6.65
Minimum Height = 6.5
Maximum Height = 6.8
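The three statistics above can be checked with a few lines of Python. This is only a small sketch using the four heights from the table; the variable names are my own.

```python
# Heights recorded over the four days in the table above.
heights = [6.5, 6.6, 6.8, 6.7]

# Average = total of the values divided by how many values there are.
average_height = sum(heights) / len(heights)
minimum_height = min(heights)
maximum_height = max(heights)

print("Average:", round(average_height, 2))
print("Minimum:", minimum_height)
print("Maximum:", maximum_height)
```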
Quantitative Data- Counted (called discrete data) or measured (called continuous data).
[Figure: Data splits into Qualitative and Quantitative; Quantitative splits into Discrete and Continuous]
Discrete data- Data that can only take certain values. For example: the number of students in a class (you can't
have half a student). A calculated value is therefore rounded up or down to a whole number: instead of 4.7 of a
person you would round it up to 5.
Continuous Data- Data that can take any value (within a range). For example: people's heights could be any value
(within the range of human heights), not just certain fixed heights.
Example ranges: 1-2, 3-4
Misrepresent- To represent incorrectly, improperly, falsely or unsatisfactorily.
Age, gender    Height
Male, 15       180cm
Female, 15     170cm
Male, 16       678cm
This is a misrepresentation because it is virtually impossible to have a 16 year old male with a height of 6 m
78 cm.
Column Graph- A graph that shows data with vertical bars.
This is an example of a column graph!
[Figure: column graph "School Pets"; vertical axis: Number of students (0 to 80); horizontal axis: Types of pets]
A column graph has 4 important components:
1. The heading / title of your experiment
2. The number of students
3. The data put into columns
4. Finally, the names of the pets or the options given during a survey
Bar graph- This is virtually the same as a column graph but the other way around. Instead of vertical columns it has
horizontal bars.
[Figure: bar graph "School Pets"; vertical axis: Animals of choice (Snakes, Fish, Mice, Birds, Dogs, Cats); horizontal axis: number of students (0 to 70)]
1. Heading/title
2. Number of students
3. Data in bars
4. The names of the pets
DID YOU KNOW?
You can make any graph with ease using Microsoft Excel:
Cats      20
Dogs      50
Birds     13
Mice      8
Fish      63
Snakes    6
Enter your results like this, then highlight the table, go into INSERT, CHARTS, choose an appropriate graph and it
should automatically show.
Pie Graph- A pie chart is a circular chart divided into sectors; each sector shows the relative size of each value.
This pie graph uses the same information as the column graph but presents it differently. The pie graph is
better if the user is looking for the percentage of the results.
[Figure: pie graph "School Pets": Fish 39%, Dogs 31%, Cats 13%, Birds 8%, Mice 5%, Snakes 4%]
Each section of the pie = one of the different choices given.
For example: Dogs = 50 = 31%
So 50 people have dogs, or 31% of the total people surveyed have dogs.
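The percentages on the pie graph can be worked out from the survey counts. The sketch below assumes the counts 20, 50, 13, 8, 63 and 6 belong to Cats, Dogs, Birds, Mice, Fish and Snakes in that order (this matches the percentages on the graph). It rounds halves up, because Python's built-in round() would turn 12.5 into 12.

```python
# Survey counts (pairing of names and counts is an assumption, see above).
pets = {"Cats": 20, "Dogs": 50, "Birds": 13, "Mice": 8, "Fish": 63, "Snakes": 6}

total = sum(pets.values())

# Percentage of the whole pie for each pet, rounded half-up to a whole number.
percentages = {
    name: int(100 * count / total + 0.5)
    for name, count in pets.items()
}

print("Total surveyed:", total)
for name, percent in percentages.items():
    print(f"{name}: {pets[name]} students = {percent}%")
```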
Scatter plot- A graph of plotted points that shows the relationship between two sets of data.
[Figure: scatter plot "Number of students with Pets" for School 1 and School 2 with two trend lines. Linear 1: y = -1.0286x + 30.267, R² = 0.0064. Linear 2: y = -7.2571x + 49.067, R² = 0.1635]
The scatter plot is good if you have two sets of information to compare. This graph has the original school pets'
information as well as data from another school. The scatter plot allows these schools to be compared.
Scatter plots are also good for showing the linear line or trend line.
Linear – An equation that makes a straight line when it is graphed.
Often written in the form: y = mx + b
So in this instance Linear 1 is y = -1.0286x + 30.267 and Linear 2 is y = -7.2571x + 49.067
R² - R² is the coefficient of determination, which is the square of the correlation coefficient r. R² is mainly used to
determine how much scatter is on the graph: a value close to 1 means the points lie close to the trend line and a value
close to 0 means they are widely scattered. The R² value can be read as a percentage.
\[ r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \, \sum (y_i - \bar{y})^2}}, \qquad R^2 = r^2 \]
To get this on the graphs made in Excel you need to:
1. Click on the chart
2. Go into Layout
3. Select Trendline, then choose the appropriate line
4. Double click on the newly formed line
5. Click "Display equation" and "Display R²"
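To see where a trend line and its R² come from, here is a rough sketch of a least-squares fit in plain Python. The data points are made up for illustration (the original survey points are not listed in this journal), and the function name linear_fit is my own.

```python
def linear_fit(xs, ys):
    """Return slope m, intercept b and R squared for the line y = mx + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of the differences from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    m = sxy / sxx
    b = mean_y - m * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return m, b, r_squared


# Hypothetical points that lie exactly on the line y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
m, b, r_squared = linear_fit(xs, ys)
print(f"y = {m}x + {b}, R squared = {r_squared}")
```

Because the made-up points sit exactly on a straight line, R² comes out as 1; the more scattered the points, the closer it gets to 0.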
Histogram- This graph is like a column graph, but each column covers a range of values (for example 100 <-> 200 or
600 <-> 700). A histogram is a great way to show results of continuous data like weight, height and time. The
histogram is also different because of the way the columns are laid out: the columns are joined together.
[Figure: histogram "Average height of students"; vertical axis: Height in (m), 0 to 800; horizontal axis: Test students (Male 1, Female 1, Male 2)]
The History of the Histogram
The word 'Histogram' is of Greek origin, meaning a form of writing in long vertical shapes, but it wasn't used until
1891, when Karl Pearson introduced it into statistics.
Population and sample
Population: A group from which the sample or data is taken.
Sample: Data gathered from part of a large group or "population" so it can be examined to find out something that
affects the whole group. E.g. if there was an outbreak, how much of the sample group was infected.
Five summaries Quartile
Q1: The lowest number in the set.
Q2: Lower Quartile- The lower quartile is the median of the lower half of the data. (E.g. in the data set below,
what is the lower quartile?)
2 2 3 3 3 4 4 5 6 8 9
Work out the median, then find the middle number of the lower numbers to its left: 4 is the median overall, and the
median of the lower numbers (2 2 3 3 3) is 3.
Q3: The median of the set of numbers.
Q4: Upper Quartile- The upper quartile is the median of the higher half of the data. (E.g. in the data set above,
what is the upper quartile?)
Work out the median, then find the middle number of the higher numbers to its right: 4 is the median overall, and the
median of the higher numbers (4 5 6 8 9) is 6.
Q5: The highest number in the set of numbers.
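The five values can also be found with a short program. This is a sketch of the "median of each half" method described above (other textbooks and calculators sometimes use slightly different quartile rules); the function names are my own.

```python
def median(sorted_values):
    """Middle value of an already sorted list (average of the two middle values if even)."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def five_number_summary(values):
    """Return (lowest, lower quartile, median, upper quartile, highest)."""
    data = sorted(values)
    n = len(data)
    lower_half = data[: n // 2]        # values to the left of the median
    upper_half = data[(n + 1) // 2:]   # values to the right of the median
    return (data[0], median(lower_half), median(data), median(upper_half), data[-1])


summary = five_number_summary([2, 2, 3, 3, 3, 4, 4, 5, 6, 8, 9])
print(summary)
```

For the data set above this gives Q1 = 2, Q2 = 3, Q3 = 4, Q4 = 6 and Q5 = 9, using the Q1 to Q5 labels from this journal.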
The Box and whisker plot:
To draw a box and whisker plot you need your numbers to be in ascending or descending order. Then
follow these steps (see Five summaries Quartile):
1. Find the median of the numbers.
2. Then find the median of the lower numbers.
3. And again find the median of the higher numbers.
4. Now you have three points. From the data, work out the highest point and the lowest point (this then gives us the
five quartile summary).
5. Draw a number line that matches your data, e.g. 2 2.5 3 3.5 …
6. Mark off the highest point, the lowest point and the three points from the previous steps.
7. Box the three points (Quartile/point 1, Quartile/point 2 and Quartile/point 3).
8. And whisker the endpoints.
To generate the box and whisker plot online go to
http://www.mathwarehouse.com/charts/box-and-whisker-plot-maker.php and follow the prompts.
The end product should look like this:
[Figure: finished box and whisker plot]
Central tendency:
Mean: Sort the numbers in ascending order, add them together and divide the total by how many numbers there are
in the list.
2 + 2 + 3 + 3 + 3 + 4 + 4 + 5 + 6 + 8 + 9 = 49
49/11 = 4.45 (to two decimal places)
Median: The median is the middle number. Sort the numbers in ascending order and count how many numbers are in
the list. 2 2 3 3 3 4 4 5 6 8 9
(11+1)/2 = 12/2 = 6th number, therefore the median is 4
Mode: The mode is the number that is repeated more than any other number. In this example three is the most used
number.
2 2 3 3 3 4 4 5 6 8 9
Range: The range is the highest number minus the lowest number.
2 2 3 3 3 4 4 5 6 8 9
9 - 2 = 7
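These four measures can be double checked with Python's built-in statistics module. This is just a quick sketch using the same list of numbers.

```python
import statistics

data = [2, 2, 3, 3, 3, 4, 4, 5, 6, 8, 9]

mean = statistics.mean(data)        # total (49) divided by 11 values
median = statistics.median(data)    # middle value of the sorted list
mode = statistics.mode(data)        # most repeated value
data_range = max(data) - min(data)  # highest minus lowest

print("Mean:", round(mean, 2))
print("Median:", median)
print("Mode:", mode)
print("Range:", data_range)
```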
Probability Distributions
Lattice diagram:
A lattice diagram is used to determine all the possible outcomes of a situation that involves two
sets of choices. One set of outcomes is written along the top of the grid and the other down the side of the grid. The
combined outcomes are then filled into the grid boxes where the matching row and column labels cross.
               Main Meal
Dessert        Chicken                 Beef                 Lamb
Fruit Salad    Chicken, Fruit Salad    Beef, Fruit Salad    Lamb, Fruit Salad
Ice cream      Chicken, Ice cream      Beef, Ice cream      Lamb, Ice cream
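A lattice diagram is really a list of every pairing of the two sets of choices. As a rough sketch, the same grid can be generated with itertools.product from Python's standard library (the variable names are my own):

```python
from itertools import product

mains = ["Chicken", "Beef", "Lamb"]
desserts = ["Fruit Salad", "Ice cream"]

# Every (main, dessert) pair, the same as filling in each box of the grid.
combinations = list(product(mains, desserts))

for main, dessert in combinations:
    print(f"{main}, {dessert}")
print("Number of combinations:", len(combinations))
```

Three mains and two desserts give 3 x 2 = 6 boxes in the grid.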
Venn diagram:
The Venn diagram was invented by John Venn. The diagram is a way to picture the connection
between different groups of things. The diagrams consist of two overlapping circles. For example:
[Figure: Venn diagram with two overlapping circles, "small and furry" (pink) and "duckbilled" (yellow), with the overlap in green; labels include Swan, Platypuses, Worms and Horses]
This diagram reads really easily: the first circle, in pink and labelled small and furry, has a number of small
animals that are furry. In the middle between the two circles, in green, is the animal that has both of
the stated features (small, furry and duckbilled). In yellow there are the animals that have a
duckbill as a feature.
Finally the two outside animals, in blue, are some of the animals that would not fit anywhere in the
diagram.
Tree diagram:
When calculating probabilities, numbers are multiplied, subtracted, divided
and added, so it can get very messy. A tree diagram has "branches"; the
probability of each branch is written on the branch and the outcome is written at the end of the
branch. To work out the results you multiply probabilities along the branches, and you add
probabilities down columns. E.g.:
[Figure: tree diagram]
Following the arrows, you multiply the numbers along each branch; then you add the results together, which gives
the overall probability.
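The multiply-along, add-down rule can be tried out in Python with exact fractions. The tree below is an assumed example (a dessert chosen with probability 1/2 each, then a main meal with probability 1/3 each); it is not taken from the missing figure.

```python
from fractions import Fraction

# First set of branches and second set of branches (assumed example values).
first = {"Fruit": Fraction(1, 2), "Lollies": Fraction(1, 2)}
second = {"Fish": Fraction(1, 3), "Chicken": Fraction(1, 3), "Beef": Fraction(1, 3)}

# Multiply along the branches to get the probability of each end point.
paths = {
    (a, b): pa * pb
    for a, pa in first.items()
    for b, pb in second.items()
}

# Add down the column of end points that contain chicken.
p_chicken_and_fruit = paths[("Fruit", "Chicken")]
p_chicken = sum(p for (a, b), p in paths.items() if b == "Chicken")

print("P(Fruit then Chicken) =", p_chicken_and_fruit)
print("P(Chicken) =", p_chicken)
print("All end points add to", sum(paths.values()))
```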
Tables: To enter or form data into a list/table.
Frequency table:
A frequency table is constructed by arranging the collected data values in ascending order with their matching
frequencies. The frequency of a data value is the number of times the data value occurs.
For example, if four students have a score of 80 in mathematics, then the score of 80 is said to have a
frequency of 4. The frequency of a data value is often represented by f.
Marks: 6, 4, 7, 5, 10, 6, 7, 8, 7, 8, 8, 9, 7, 5, 6, 6, 9, 4, 7, 8
To construct a frequency table:
1. Construct a table with three columns. The first column shows what is being arranged (here, the marks in ascending
order).
2. Go through the list of marks and put a tally mark against the matching number in the second
column. Continue this process until all marks in the list are tallied.
3. Count up the tally marks and put the total under frequency in the third column of the table.
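The three steps can be sketched in Python with collections.Counter, using the list of marks above. The tally column is printed here as a plain row of strokes.

```python
from collections import Counter

marks = [6, 4, 7, 5, 10, 6, 7, 8, 7, 8, 8, 9, 7, 5, 6, 6, 9, 4, 7, 8]

# Count how many times each mark occurs.
frequency = Counter(marks)

# Print the three columns: mark, tally, frequency (f), in ascending order of mark.
print("Mark  Tally  f")
for mark in sorted(frequency):
    print(f"{mark:<5} {'|' * frequency[mark]:<6} {frequency[mark]}")

print("Total number of marks:", sum(frequency.values()))
```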
Probability:
Probability measures or quantifies how likely it is that an event, related to a given type of experiment,
will happen. The value of a probability is a number between 0 and 1 inclusive. An event that cannot occur has a
probability (of happening) equal to 0, and an event that is certain to occur has a probability equal
to 1.
A probability can be written as a fraction: ½
This can be converted into a decimal: 0.5
Then into a percentage: 50%
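The same conversion can be done with Python's fractions module. This is only a small sketch, and the helper name describe is my own.

```python
from fractions import Fraction


def describe(probability):
    """Return a probability as (fraction, decimal, percentage)."""
    decimal = float(probability)
    percentage = decimal * 100
    return probability, decimal, percentage


fraction, decimal, percentage = describe(Fraction(1, 2))
print(f"{fraction} = {decimal} = {percentage}%")
```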
Characteristics of shape
The shape of the data decides which statistical techniques can be used to make decisions
about the probability of the data.
Symmetrical Data: Symmetrical data is the easiest type of data to use. This is because many statistical
techniques have been developed for it. In fact symmetrical data is so common that its shape is called the
normal curve. It also has other names, like the bell curve. There are standard measurements available
which can tell the probability of a data point occurring based on the number of standard deviations it is
away from the mean. From a six sigma point of view it helps to understand how the results of a process are
likely to be distributed.
Skewed Data: Many times the data is not symmetrical, i.e. it is skewed towards one side. Data can be
either positively or negatively skewed. There are statistical techniques available which help us find out the
probability distributions of skewed data too. However such techniques are not very well developed. This is
because most of the sample data being collected usually follows the normal distribution. Statistical analysis
of skewed data is therefore not often performed.
Question 1: Sort the following into: Data, Discrete, Continuous or Misrepresentation. The first is already done.
a. 14 cm
b. 2-4
c. 178977 cm
d. 4.7 or 5
e. 7-9
f. 3456789 mm
g. 1.3 or 1
h. 23 m
Data: A) 14 cm
Discrete:
Continuous:
Misrepresentation:
Answers and Explanation
Data is often given as values or measurements. Discrete data is data that has been rounded to a whole
number. Continuous data is data that falls within a range (e.g. people's heights could be any value
within the range of human heights, not just certain fixed heights). And finally a misrepresentation is
when the data is presented incorrectly.
Data: A) 14 cm, H) 23 m
Discrete: D) 4.7 or 5, G) 1.3 or 1
Continuous: B) 2-4, E) 7-9
Misrepresentation: C) 178977 cm, F) 3456789 mm
Question 2: Answer the following with use of the graphs.
a. How many students have rabbits?
b. How many have snakes?
c. How many have dogs or fish?
d. How many students were surveyed all together?
e. Which graph was easier to read?
[Figure: column graph "School Pets"; vertical axis: Number of students (0 to 70); horizontal axis: Types of pets (Cats, Dogs, Birds, Mice, Fish, Snakes)]
[Figure: bar graph "School Pets"; vertical axis: Animals of choice (Snakes, Fish, Mice, Birds, Dogs, Cats); horizontal axis: number of students (0 to 70)]
Answers and explanation: The column graph shows data with vertical bars and the bar graph shows data with
horizontal bars.
a. 0, there are no records of rabbits
b. Between 5 and 10 (6)
c. 113 students have dogs or fish all together
d. 160 students surveyed altogether
e. It is up to personal preference.
Question 3: Answer the following questions about the pie graph.
a. What is the combined percentage of the birds, mice and snakes?
b. What is the combined percentage of fish and dogs?
c. What is the full pie percentage?
d. Is the pie graph the best way to represent this data?
[Figure: pie graph "School Pets": Fish 39%, Dogs 31%, Cats 13%, Birds 8%, Mice 5%, Snakes 4%]
Answer and explanation: A pie chart is a circular chart divided into sectors; each sector shows the relative size of
each value.
a. 17%
b. 70%
c. 100%
d. Personal opinion
Question 4: Answer the following from the data given.
Q1= 6    Q2=    Q3= 16.5    Q4=    Q5= 63
a. With the five quartile summary given, can you identify Quartile 2 and Quartile 4?
b. Find the average of the data given.
c. Determine the quartiles for the following set of data: 13, 90, 3, 12, 24, 0.
d. Looking at the box and whisker plot below, what are Quartile 1 and Quartile 5?
[Figure: box and whisker plot]
Answers and explanation: To draw a box and whisker plot you need your numbers to be in
ascending or descending order. Then follow the five quartile summary steps. To work out the quartiles
follow these steps:
Q1: The lowest number in the set.
Q2: Lower Quartile- The lower quartile is the median of the lower half of the data.
Q3: The median of the set of numbers.
Q4: Upper Quartile- The upper quartile is the median of the higher half of the data.
Q5: The highest number in the set of numbers.
a. Q2= 8 and Q4= 50
b. 26.6 is the average. To work it out you add them all together then divide by the amount of numbers.
c. Q1=0 Q2=3 Q3=12.5 Q4=24 Q5=90
Question 5: Answer the following questions with the sets of numbers below.
2, 2, 6, 8, 10, 12, 14, 16, 18, 20
Find the:
a. Range
b. Median
c. Mode
d. Mean
3, 3, 3, 5, 19, 34, 49, 62
Find the:
a. Median
b. Range
c. Mode
d. Mean
Answers and explanation: Mean: Sort the numbers in ascending order, add them together and divide the
total by how many numbers there are in the list. Median: The median is the middle number. Sort the
numbers in ascending order and count how many numbers are in the list; if there are two middle numbers, the median
is halfway between them. Mode: The mode is the number that is repeated
more than any other number. Range: The range is the highest number
minus the lowest number.
Find the following for: 2, 2, 6, 8, 10, 12, 14, 16, 18, 20
a. 18
b. 11 (halfway between 10 and 12)
c. 2
d. 10.8 (108/10)
Find the following for: 3, 3, 3, 5, 19, 34, 49, 62
a. 12 (halfway between 5 and 19)
b. 59
c. 3
d. 22.25 (178/8)
Question 6: Answer the following questions using the information provided.
[Figure: Venn diagram "Animals" with two overlapping circles, "With claws" and "With stripes"; labels include Tiger, Quagga, Cats and Skunks]
1. How many animals have stripes?
2. How many animals have claws?
3. How many have both?
Answers and explanation: The diagram is a way to picture the connection between different groups of
things. The diagrams consist of two overlapping circles.
1. How many animals have stripes? 5
Look at the circle labelled "With stripes" and count how many entries are in the circle. The same
goes for the other circle.
2. How many animals have claws? 5
Then the middle (between the two circles) shows what has both of the traits given in
the diagram.
3. How many have both? 3
Question 6 part b:
           Main meal
Dessert    Fish             Chicken             Beef
Fruit      Fish, Fruit      Chicken, Fruit      Beef, Fruit
Lollies    Fish, Lollies    Chicken, Lollies    Beef, Lollies
a. How many chicken meals?
b. How many different combinations have Lollies?
c. How many different combinations can you have all together?
Answers and explanation: A lattice diagram is used to determine all the possible outcomes of
a situation that involves two sets of choices. One set of outcomes is written along the top of the grid and the other
down the side of the grid. The combined outcomes are then filled into the grid boxes where the matching
row and column labels cross.
a. 2
b. 3
c. 6
Question 6 part c:
[Figure: tree diagram. First branches: Fruit 1/2 and Lollies 1/2. From each of these, main meal branches: Fish 1/3, Chicken 1/3, Beef 1/3]
a. How many chicken meals?
b. How many different combinations have Lollies?
c. How many different combinations can you have all together?
d. What is the probability of having a chicken meal with fruit?
Answers and explanation: When calculating probabilities, numbers are multiplied,
subtracted, divided and added, so it can get very messy. A tree diagram has "branches"; the
probability of each branch is written on the branch and the outcome is written at the end of the
branch. To work out the results you multiply probabilities along the branches, and you add probabilities
down columns.
a. 2
b. 3
c. 6
d. 1/6 (about 0.17 or 17%), because 1/2 x 1/3 = 1/6
Question 7: Read the results in the frequency table and answer the questions.
Dogs      llll llll lllll
Cats      lll llllll
Fish      llll llll
Rabbit    lllll llll
a. How many dogs are there?
b. How many people were surveyed all together?
c. What mistake have the survey and table made?
Answers and explanation: A frequency table is constructed by arranging the collected
data values in ascending order with their matching frequencies. The frequency of a
data value is the number of times the data value occurs.
For example, if four students have a score of 80 in mathematics, then the score of 80 is said
to have a frequency of 4. The frequency of a data value is often represented by f.
a. 13
b. 39
c. Every 5th tally line needs to strike through the previous four.
Question 8: Answer the following questions. Probability definition: P(E) = n(E) / n(S)
a. What is the percentage of the probability 0.34?
b. What is the decimal if the probability is 2/5?
c. Convert a 75% probability into a fraction.
d. If there are four different coloured balls, a yellow, a red, a blue and a green, in a bag, what is the
probability of pulling out a green ball?
Answer and explanation: Probability measures or quantifies how likely it is that an event, related to
a given type of experiment, will happen. The value of a probability is a number between 0 and 1 inclusive. An event
that cannot occur has a probability (of happening) equal to 0, and an event that is certain to occur
has a probability equal to 1. A probability can be written as a fraction: ½. This can be converted into a
decimal: 0.5. Then into a percentage: 50%
a. 34%
b. 0.4
c. ¾
d. ¼ or 0.25 or 25%
Question 9: Name which type of data is shown in the following diagrams.
a. [Figure: distribution a]
b. [Figure: distribution b]
a. Skewed
b. Symmetrical
Answers and explanation: Symmetrical Data: Symmetrical data is so common that its shape is called the
normal curve. It also has other names, like the bell curve. There are standard measurements available
which can tell the probability of a data point occurring based on the number of standard deviations it is
away from the mean. Skewed Data: Many times the data is not symmetrical, i.e. it is skewed towards one
side. Data can be either positively or negatively skewed. Statistical analysis of skewed data is not
often performed.
Test and reflections
Before the marked test reflection: The test went really well, but I'm glad that there was a practice test because
before that it was hard to create situations that I could work out solutions to. The questions that I had chosen to
create were too easy and they didn't challenge me. I think it was because I spent a lot of time making a 'text book'
and discovering new things. It's taken a longer time because I feel that I have to learn the basics of the information
before I can start learning something new. During the test I had problems with the histogram. I'm not too sure that
the answer is correct, because the journal entry for it was very brief and it isn't a graph that I would use often.
Another part of the test I was unsure about was the reasonableness and the possible limitations. It wasn't included
in my CML so I answered with what I thought it meant, based on the supposed meanings. The rest of the
questions I had a lot of practice with (the workshop on drawing a box and whisker plot on the calculator helped a
lot), so I felt pretty confident.
After the marked test:
The test went well, like I had predicted in the previous reflection. I was very happy with my efforts when I received a
B. This showed that I am slowly getting into a working study routine. The question that I had wrong was because I
didn't read the question correctly; this led me to using the wrong formula for the data output and getting the
question wrong. I really benefited from the detailed journal entries and the practice test that was given beforehand.
It allowed me to see what parts I was lacking and where I needed to improve for next time.
Looking at the rubric I think that I could have improved on explaining the mathematical ideas and logical reasoning by
writing out all of the formulas that I had used, not just the ones that were needed. I had written all my answers in
shortened form (e.g. 2/10 – 1/5); this may have pushed my grade down lower.
Overall I'm really happy with this grade. Next time I want to improve on reading the rubric and having results
compliant with it.
Finished CML reflection: During this CML I focused on the reflective side of the journal a lot more than last time. I
had even tried to do it electronically; this led to more flexibility when it came to layout and information. I had also
taken more time to work on the modelling task and less time on the things that I had known previously. I am fairly
confident with the topic of statistics and probability now, but the process of the CML is still trial and error
currently.
Bibliography
http://easycalculation.com/statistics/box-plot-grapher.php
http://www.eduplace.com/graphicorganizer/pdf/venn.pdf
http://www.mathsisfun.com/
http://www.mathswarehouse.com
http://www.mathsteacher.com
http://www.purplemaths.com
http://www.maths.andyou.com
Remember
When is a population normal?
Understand
Because theoretical means to estimate or guess,
whereas experimental means to test the theory or
guess. So usually the experimental probability
is the better solution.
What can you tell about a population by considering:
- Shape of the overall distribution,
- Symmetry of sample distribution,
- Skewness of the distribution.
Analyse
Explain why theoretical and experimental probabilities
can give different values for the same situation.
SACE Stage 1 (Year 11)
Compare the accuracy of the data that can be found
using the entire population and the Central Limit
Theorem to on a sample.
Evaluate
Pre-SACE (Year 10)
Bloom’s
cognitive
stuff
Conceptual checklists
Discuss the benefits of using the empirical rule (68%,
95.4%, 99.7%) for a Normal Distribution. Consider
accuracy, ease of use and anything else you can
think of.
Suggest several situations that can be described
using each of the following:
- Discrete Quantitative Data
When data can only take certain values that can't
be split. Like when counting the number of
students in a classroom, you can't have 23.5
students because it's impossible to have half of
a student.
- Continuous Quantitative Data
When data can take any value within a range
of values. For example people's heights can vary,
but when there is a lot of data you can create a
range between the basic points.
- Categorical Data
Categorical data is when a set of data is sorted
into different categories according to the data's
features. So instead of having just girls the data
may be sorted into girls with brown eyes or girls
with blonde hair.
In what situations would you use each of the three
measures of “average”, mean, median and mode, to
show the different characteristics of a population or
sample? When using central tendency, it can be
used to discover the population mean. The other
situation using the three measurements would be
when working out the frequency distribution.
When is it reasonable to use each of the following to
describe a population?
- the five number summary used to describe a
box-and-whisker plot,
The box and whisker plot is often used to
plot a large amount of numerical data, for
example height.
YEAR 11 LEARNING OUTCOMES
Before the CML: I am quite disappointed with my last test result, so this time I want my journal to be more effective
in the way that I will be able to use the journal and be able to explain my understanding of the work done. This year
the year ten things covered are a bit different to the work done last year, so I have looked into those outcomes to
make sure that I have a stable understanding of this work. Looking over the dot points I think the hardest points
would be the combination notation & formula and calculating the compound probabilities.
Populations and sample size:
Population: The group from which a sample has been taken. It is the thing too big to be tested fully, so samples
have to be taken. For example: the 'population' of children at the Royal Adelaide Show. If an organiser
wanted to see the statistics of children's enjoyment, he/she would want to take a sample out of the population.
Sample size: A sample is a selection taken from the population or larger group in order to find something out about that
population or group. For example: a scientist wants to know the salinity of the sea. The scientist doesn't test the whole
sea; instead he/she takes a sample from the population (the sea in this case) and tests that. The sample size can vary
depending on what's being tested and the accuracy of the test itself.
How do we determine the sample size?
There are four things needed to ensure that we have a sufficient sample size from our population.
1. Population size- How much is there in total? If you don't know, it's not strange; most population sizes aren't
known exactly.
2. Margin of error (also called confidence interval)- This means to leave room for error. Your sample will
never be perfect, so you have to allow for error that might occur. The confidence interval determines
how much higher or lower than the population mean you're going to let your sample mean fall.
3. Confidence level- How confident do you want to be that the actual mean falls within your confidence
interval? The most common confidence levels are 90%, 95% and 99%.
4. Standard deviation- How much variation do you expect in your responses? If we don't know the
approximate variation, the recommended value to use is 0.5.
To calculate the needed sample size, your confidence level is converted to a Z-score, a constant value needed for the
equation. Here are the Z-scores for the most common confidence levels:
- 90% - Z Score = 1.645
- 95% - Z Score = 1.96
- 99% - Z Score = 2.576
Now use this equation to calculate the necessary sample size:
sample size = (Z-score)² x StdDev x (1 - StdDev) / (margin of error)²
This example assumes a 95% confidence level, a standard deviation of 0.5 and a
margin of error of +/-5%:
((1.96)² x .5(.5)) / (.05)²
= (3.8416 x .25) / .0025
= .9604 / .0025
= 384.16
385 respondents are needed
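The sample size equation can be put into a small function so that other confidence levels are easy to try. This is a sketch of the formula as written above; the function and parameter names are my own, and the answer is rounded up because you can't survey part of a person.

```python
import math


def sample_size(z_score, std_dev, margin_of_error):
    """Sample size = Z² x StdDev x (1 - StdDev) / (margin of error)², rounded up."""
    exact = z_score ** 2 * std_dev * (1 - std_dev) / margin_of_error ** 2
    return math.ceil(exact)


# 95% confidence level, standard deviation 0.5, margin of error +/-5%.
respondents = sample_size(z_score=1.96, std_dev=0.5, margin_of_error=0.05)
print(respondents, "respondents are needed")
```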
Independent VS Dependent events
Dependent: Dependent events are those that depend on what happened before.
For example: 2 blue marbles and 3 red marbles are in a bag. The chance of getting a blue marble out is 2 in 5, but if
you take one of the marbles out then the chance changes: it becomes 1 in 4 if a blue marble was removed, or 2 in 4 if
a red marble was removed.
This gives us a dependent event because the chance now depends on the marble that was taken out,
that is, on the event that happened before.
Independent: Independent events are not affected by previous events.
For example: You toss a coin and it comes up heads three times. What is the chance that the next toss will also
be a head? The chance is still 50%, just like any other toss of a coin. The previous times the coin was tossed have no
effect on the current toss.
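The marble example can be checked with exact fractions. This is a minimal sketch; the function name chance_of_blue is my own.

```python
from fractions import Fraction


def chance_of_blue(blue, red):
    """Probability of drawing a blue marble from a bag with the given counts."""
    return Fraction(blue, blue + red)


first_draw = chance_of_blue(2, 3)          # full bag: 2 blue, 3 red
after_blue_removed = chance_of_blue(1, 3)  # one blue marble already taken out
after_red_removed = chance_of_blue(2, 2)   # one red marble already taken out

print("First draw:", first_draw)
print("After a blue was removed:", after_blue_removed)
print("After a red was removed:", after_red_removed)
```

The second draw depends on what came out first, which is what makes these events dependent.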
Conditional statements
A conditional statement is a statement with 'if' and 'then', like a hypothesis: If I add fertilizer to the plant then it will
grow better. The 'if' part is the hypothesis and the 'then' part is the conclusion. This can also be written as p -> q, if p
then q. A conditional statement is false if the hypothesis is true and the conclusion is wrong, for example: If I add
fertilizer to the plant then it won't grow. If the conditional statement is rearranged then it's called a related conditional.
For example: If a population consists of 50% men then 50% of the population must be women: p -> q. Changing
the statement to: If a population consists of 50% women then 50% of the population must be men: q -> p. This is a
converse statement.
A conditional and its converse don't mean the same thing, so one can be true while the other is false.
If both the hypothesis and the conclusion are negated then we get an inverse statement. For example: if
the population does not consist of 50% male then the population doesn't consist of 50% female: ~p -> ~q
A pattern of reasoning is valid if it always leads from true statements to a true conclusion. The most common patterns of
reasoning are detachment and syllogism.
Given: Known or independently determined, e.g. a given ratio.
Instead of "and" there is often the ∩ symbol (which is also the intersection symbol in Venn diagrams). Instead of "or"
you will often see the union symbol ∪.
For example: The probability of scoring no goals is 20% (Event A) and the probability of scoring 1 goal is 15% (Event B).
The probability of scoring no goals and 1 goal is 0 (impossible), but the probability of scoring no goals or 1 goal is
20% + 15% = 35%. This is written:
P(A ∩ B) = 0
P(A ∪ B) = 20% + 15% = 35%
Dog owners D and Cat owners C
n(C) = 5 + 3 = 8
n(D) = 3 + 2 = 5
C∩D = Cat intersection dog
= People who own both cats and dogs
n(C∩D) = 3
C∪D = Cat union dog
= People who own cats or dogs or both
n(C∪D) = n(C) + n(D) - n(C∩D) = 8 + 5 - 3 = 10 <- n(C∩D) is subtracted, otherwise those people are counted twice
Dc or D’ = The complement of D = those who don’t own a dog = 5 + 10 = 15
n(C∩D’) = 5
n(C∪D’) = 18
n((C∪D)’) = 10
n(C’∪D’) = 17
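The counts above can be double checked with Python sets. This is just a sketch: the 20 people are hypothetical and numbered 0 to 19 so that 5 own only a cat, 3 own both, 2 own only a dog and 10 own neither, which matches the numbers in the working.

```python
# Hypothetical survey of 20 people, numbered 0-19, chosen to match the counts above.
cat_only = set(range(0, 5))     # 5 people own only a cat
both = set(range(5, 8))         # 3 people own a cat and a dog
dog_only = set(range(8, 10))    # 2 people own only a dog
neither = set(range(10, 20))    # 10 people own neither

everyone = cat_only | both | dog_only | neither
C = cat_only | both             # cat owners
D = dog_only | both             # dog owners


def complement(s):
    """Everyone in the survey who is not in the set s."""
    return everyone - s


print(len(C), len(D))                          # n(C), n(D)
print(len(C & D))                              # n(C∩D)
print(len(C | D))                              # n(C∪D)
print(len(complement(D)))                      # n(D')
print(len(C & complement(D)))                  # n(C∩D')
print(len(C | complement(D)))                  # n(C∪D')
print(len(complement(C | D)))                  # n((C∪D)')
print(len(complement(C) | complement(D)))      # n(C'∪D')
```

In Python `&` is the intersection (∩) and `|` is the union (∪), so each line of the working maps onto one line of code.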
Mutually Exclusive events
Mutually Exclusive: Means that two events can’t happen at the same time. For example: You can’t turn left and right at
the same time, so turning left and turning right are mutually exclusive.
The probability of an event is the number of ways it can happen divided by the total number of
outcomes. An example of this is having a deck of 52 cards, with 4 kings. What is the probability of picking a king?
Number of ways it can happen = 4
Total number of outcomes = 52
So the probability is equal to 4/52 or 1/13
When two events are mutually exclusive it is impossible for them to happen together: “The probability of A and B
together equals 0 (impossible)”. But the probability of A or B is the sum of the individual probabilities:
P(A or B) = P(A) + P(B)
“The probability of A or B equals the probability of A plus the probability of B”
For example: In a deck of 52 cards, the probability of a King is 1/13, so P(King) = 1/13. The probability of an Ace is
also 1/13, so P(Ace) = 1/13.
When we combine the two events, the probability of one card being both an Ace and a King is 0, or impossible. But the
probability of a card being a King or an Ace is 2/13.
P(King and Ace)=0
P (King or Ace)= (1/13 )+ (1/13) = 2/13
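The King and Ace example can be checked by building a deck and counting cards. This is a minimal sketch that assumes a standard 52 card deck with no jokers; the rank and suit labels are just for the illustration.

```python
from fractions import Fraction

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]

# Standard 52 card deck (assumption: no jokers).
deck = [(rank, suit) for rank in ranks for suit in suits]

kings = {card for card in deck if card[0] == "K"}
aces = {card for card in deck if card[0] == "A"}

p_king = Fraction(len(kings), len(deck))
p_ace = Fraction(len(aces), len(deck))

# No card is both a King and an Ace, so the intersection is empty.
p_king_and_ace = Fraction(len(kings & aces), len(deck))

# Counting the union directly gives the same answer as adding the two probabilities.
p_king_or_ace = Fraction(len(kings | aces), len(deck))

print(p_king, p_ace, p_king_and_ace, p_king_or_ace)
```

Because the two sets of cards share nothing, counting the union and adding the two probabilities agree, which is exactly what the addition rule for mutually exclusive events says.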
Conditional Probability and Independence
In probability theory, a conditional probability measures the probability of an event given that (by
assumption, presumption, assertion or evidence) another event has occurred. If the events are A and B, this is said to
be “the probability of A given B”. It is commonly denoted by P(A|B) or sometimes P_B(A). If both A and B are
categorical variables, a conditional probability table is typically used to represent the conditional probability. The
concept of conditional probability is one of the most fundamental and most important concepts in probability
theory. But conditional probabilities can be hard to work with and require careful interpretation. In statistical
inference, the conditional probability is an update of the probability of an event based on new information.
Incorporating the new information can be done as follows:
1. Start with a probability measure on a sample space
2. To measure the probability of event A knowing that event B has or will have occurred, we need to
examine event A as it is restricted to event B, that is, A ∩ B.
3. Whenever P(B) > 0 with the original probability measure on the original sample space, B must be the
sure event in the restricted space and thus P_B(B) must be 1.
4. To derive P(A|B) = P_B(A) so that P(B|B) = 1, we re-scale P(A ∩ B) by dividing by P(B).
5. This gives P(A|B) = P(A ∩ B) / P(B) whenever P(B) > 0 and 0 otherwise.
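The steps above can be sketched in Python. As an illustration only, this reuses the cat and dog owner counts from the Venn diagram working earlier in this journal (5 cat only, 3 both, 2 dog only, 10 neither); the function name `conditional` is made up for the example.

```python
from fractions import Fraction

# Counts reused from the cat and dog owner example (assumption for illustration).
cat_only, both, dog_only, neither = 5, 3, 2, 10
total = cat_only + both + dog_only + neither

p_D = Fraction(both + dog_only, total)   # P(D): probability a person owns a dog
p_C_and_D = Fraction(both, total)        # P(C ∩ D): owns both a cat and a dog


def conditional(p_a_and_b, p_b):
    """P(A|B) = P(A ∩ B) / P(B) when P(B) > 0, and 0 otherwise (as in step 5)."""
    if p_b == 0:
        return Fraction(0)
    return p_a_and_b / p_b


# Probability that someone owns a cat, given that they own a dog.
p_C_given_D = conditional(p_C_and_D, p_D)

# Step 3: in the restricted space B is the sure event, so P(D|D) should be 1.
p_D_given_D = conditional(p_D, p_D)

print(p_C_given_D, p_D_given_D)
```

Out of the 5 dog owners, 3 also own a cat, so dividing by P(D) simply re-scales the dog owners to be the whole sample space.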
Everyday occurrences to illustrate probability
The bus being on time:
P(A or B) = P(Bus being on time or bus not being on time)
P(A or B) = (1/2) + (1/2) = 2/2 = 1
Probability of the event happening = 0.5
Probability of the event not happening = 0.5
Waking up before my alarm:
Probability of the event happening = 0.5
Probability of the event not happening = 0.5
P(A) = 1
P(B) = 0.5
P(A ∩ B) = 0.5
P(A’) = 0
P(B’) = 0.5
P(A ∪ B) = 1
Having a relief teacher for class on Monday:
Number of teachers on Monday: 4
Likelihood of having a relief teacher: 0.5
0.5/4= 0.125
Probability of having a relief teacher for class on Monday: 0.125, or 12.5%
Probability event occurs P(A) = 0.008
Probability the event doesn’t occur = 0.992
P(A) = 0.004
P(B) = 0.004
P(A’) = 0.996
P(B’) = 0.996
P(A ∪ B) = 0.008
Combination notation and formula: C(n, r) = n! / (r! (n - r)!)
To calculate the combination “n choose r”, also written (n r) or C(n, r), use:
C(n, r) = n! / (r! (n - r)!), for 0 <= r <= n
The formula shows the number of ways a sample of ‘r’ elements can be obtained from a larger set of
‘n’ distinguishable objects where order does not count and repetitions are not allowed. In other words, it is
the number of ways of picking r unordered outcomes from n possibilities.
Combination Problem 1: A teacher is going to choose 3 students from her class to compete in the
spelling bee. She wants to figure out how many unique teams of 3 can be created from her class of
25.
In this example, we are taking a sub-set of 3 students (r) from a larger set of 25 students (n).
Looking at the formula, we must calculate “25 choose 3”
C (25,3)= 25!/(3! * (25-3)!)= 2,300 Possible Teams
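The spelling bee answer can be checked in Python. This sketch works the formula out with factorials and then compares it with `math.comb`, which is built into Python 3.8 and later; the helper name `choose` is made up for the example.

```python
import math


def choose(n, r):
    """Number of ways to pick r items from n when order does not count."""
    if not 0 <= r <= n:
        raise ValueError("r must be between 0 and n")
    # n! / (r! (n - r)!) always divides exactly, so integer division is safe.
    return math.factorial(n) // (math.factorial(r) * math.factorial(n - r))


teams = choose(25, 3)
print(teams)            # number of unique teams of 3 from a class of 25
print(math.comb(25, 3)) # the built-in function gives the same count
```

Writing the formula out by hand shows where the number comes from, while `math.comb` is the shorter option for everyday use.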
Relative frequencies from data as point estimates of probabilities
Some probabilities cannot be calculated just by looking at the situation. For example, you cannot work out the
probability of winning a football match by assuming that win, lose and draw are equally likely, but we can look at
previous results in similar matches and use these to estimate the probability of winning.
For example: The Bumbleton and Sickton village football teams have played each other 50 times. Bumbleton have
won 10 times, Sickton have won 35 times and the teams have drawn 5 times. Sickton have won 35/50 or 7/10 of the
matches. This fraction is the estimate of the probability that Sickton win, so the relative frequency is equal to 7/10.
We calculate the relative frequency of an outcome using this formula:
Relative frequency = number of times the outcome occurred / total number of trials
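The football example can be worked through in Python. This is only a sketch: the number of draws is not taken from the results directly but is assumed to be every match that neither team won (50 - 10 - 35), and the dictionary keys are made up for the illustration.

```python
from fractions import Fraction

matches_played = 50
bumbleton_wins = 10
sickton_wins = 35
# Assumption: every match that neither team won was a draw.
draws = matches_played - bumbleton_wins - sickton_wins

results = {
    "Bumbleton win": bumbleton_wins,
    "Sickton win": sickton_wins,
    "Draw": draws,
}

# Relative frequency = times the outcome occurred / total number of trials.
relative_frequency = {
    outcome: Fraction(count, matches_played) for outcome, count in results.items()
}

for outcome, freq in relative_frequency.items():
    print(outcome, freq, float(freq))
```

The three relative frequencies add up to 1, which is a quick way to check that no match has been left out or counted twice.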
Calculate compound probabilities using P(A and B) = P(A) x P(B)
We use P to mean “probability of”. Two events are independent when one has no effect on the other, and for
independent events we use:
P(A and B) = P(A) x P(B)
“The probability of A and B equals the probability of A times the probability of B”
For example: You are going to a concert, and your friend says it is some time on the weekend between 4 and 12 but
won’t say more. What are the chances it is on Sunday between 10 and 12?
Day: there are two days on the weekend so P(Sunday)= 0.5
Time: Between 4 and 12 is 8 hours, but you want between 10-12 which is only 2 hours: P(Your time) = 2/8 = 0.25
P(Sunday and Your time) = P(Sunday) x P(Your Time) = 0.5 X 0.25 = 0.125 or 12.5% chance
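The concert example can be checked two ways in Python: with the multiplication rule, and by counting every possible day and hour. This is a sketch that assumes both days are equally likely and that each one-hour slot between 4 and 12 is equally likely; the variable names are made up for the illustration.

```python
from fractions import Fraction
from itertools import product

days = ["Saturday", "Sunday"]
start_hours = list(range(4, 12))   # one-hour slots starting at 4, 5, ..., 11

# Multiplication rule for independent events.
p_sunday = Fraction(1, len(days))
wanted_hours = [h for h in start_hours if h >= 10]   # slots 10-11 and 11-12
p_your_time = Fraction(len(wanted_hours), len(start_hours))
p_by_rule = p_sunday * p_your_time

# Check by counting every equally likely (day, hour) combination.
slots = list(product(days, start_hours))
favourable = [(d, h) for d, h in slots if d == "Sunday" and h >= 10]
p_by_counting = Fraction(len(favourable), len(slots))

print(p_by_rule, p_by_counting, float(p_by_rule))
```

Both methods give the same fraction because the day and the time do not affect each other, which is what makes multiplying the two probabilities valid.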
Year 11 reflections
After CML: After this CML I feel a little more confident with the outcomes. The questions on the CML were well laid
out, making it easier for me to understand and make the necessary adaptations to the CML where I needed them most. The
confusion about the due dates made me procrastinate a bit, and I was a little slack with the test reviewing. I enjoyed
not having a modelling task to go with this CML but at the same time I am quite curious as to what it would have been.
After the Test: I think that I did pretty well in this test, compared to other tests. I had a wide range of concepts
covered in my CML and this test was better worded than last time. I felt I understood most of the questions, but
wished that I had more time, or that the time was marked on the board so that I could balance my time out. Instead I
ran out of time.
ILM
Last year my ILM Blue score was 8, and this hasn’t changed. This is good because logical behaviour is used a lot
within the school, maths included. My score for being Holistic, Intuitive, Synthesizing and Integrating is lower
than last year. I think this is because I spend a lot of time being on my own, or working on my own. My emotional
interpersonal self has increased from 4 to 10. This is good because I can see that I have improved myself over the
year that I have been at ASMS. I worked hard on trusting people and building relationships with them. This has
improved my wariness about people and I am able to interact on a better level. My organizational quadrant has
also improved by one point, but I would like it to improve further to minimize the stress of assignments and
other things.