Statistics for Psychology
Patrick Murphy
Department of Statistics
Room L548
5th Floor Library Building
[email protected]
12 Lectures
2.00 pm Tuesdays
Theatre L
Textbook
Seeing Through Statistics
by
Jessica Utts
Duxbury Press
CLASS WEBPAGE
1. Go to the Statistics
Department Website
WWW.UCD.IE/~Statdept/
2. Then click on ClassPages
in the left frame
3. Finally click on
Statistics for Psychology
What do you
know about
statistics?
It’s boring…
Frogs
and Princesses
There are three
kinds of lies:
Lies
 Damned Lies
 and
 Statistics


- Benjamin Disraeli
A single death is a tragedy,
a million deaths is a
statistic.
Joseph Stalin (1879-1953)
The weaker the data
available upon which to
base one's conclusion, the
greater the precision which
should be quoted in order to
give the data
authenticity.
Norman R. Augustine
Simpsons episode:

Homer is questioned about his
newly formed vigilante group
Newscaster: Since your group
started up, petty crime is down
20%, but other crimes are up.
Such as heavy sack beating
which is up 800%. So you’re
actually increasing crime.
Homer: You can make up
statistics to prove anything.
43% of people know that.
Misuse of Statistics
The Great Meryl Streep Apple
Juice Cancer Scare
 Asbestos is really bad for you
so we need to eradicate it from
our buildings

Aeroplanes
1/1,000,000 chance of a bomb
on a plane
 Aeroplane Engines

What about Probability?


The foundation of Probability
theory lies in problems associated
with gambling and games of chance
The Romans played a game
with ASTRAGALI - Heel bones of
animals
DICE

DICE as we know them were
invented around 300 BC
“I lied, cheated and stole to
become a millionaire. Now
anybody at all can win the
lottery and become a
millionaire”
LOTTO 6/42
What are the chances of winning with one selection of 6 numbers?
Matches   Odds
6         1 in 5,245,786
5         1 in 24,286
4         1 in 555
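The odds for a 6/42 draw can be checked by counting combinations. The sketch below is only an illustration: the function name `lotto_odds` and its parameters are made up for this example, and it ignores the bonus number, so "match 5" means exactly 5 of your 6 numbers are drawn.

```python
from math import comb

def lotto_odds(matches, picked=6, pool=42):
    """Return N such that the chance of exactly `matches` correct numbers is 1 in N.

    Hypothetical helper for illustration; the bonus number is ignored.
    """
    total = comb(pool, picked)  # all possible draws of 6 numbers from 42
    # Ways the draw can contain `matches` of our numbers and the rest from the other 36
    favourable = comb(picked, matches) * comb(pool - picked, picked - matches)
    return total / favourable

for m in (6, 5, 4):
    print(m, f"1 in {lotto_odds(m):,.0f}")
```

Rounded to whole numbers, these agree with the odds quoted for 6, 5 and 4 matches.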
LOTTO 6/42

The average time to win each of the prizes is given by:

Prize                Average time to win
Match 3 with Bonus   2 Years, 6 Weeks
Match 4              2 Years, 8 Months
Match 5              116 Years, 9 Months
Match 5 with Bonus   4323 Years, 5 Months
Share in Jackpot     25,220 Years
Why do people still play the
lottery?
If you’re not in you can’t win!
 You never know your luck
until you try!
 My chances of winning a
million are better than my
chances of earning a million.

 The
lottery is a tax on the
statistically challenged.
Lincoln & Kennedy
Abraham Lincoln was elected
to Congress in 1846.
 John F Kennedy was elected to
Congress in 1946.
 Abraham Lincoln was elected
President in 1860.
 John F. Kennedy was elected
President in 1960.
 The names Lincoln and
Kennedy each contain seven
letters.
 Both were particularly
concerned with civil rights.

Lincoln & Kennedy
Both wives lost a child while
living in the White House.
 Both Presidents were shot on a
Friday.
 Both Presidents were shot in the
head.
 Lincoln's secretary was named
Kennedy.
 Kennedy's secretary was named
Lincoln.
 Both were assassinated by
Southerners.

Lincoln & Kennedy
Both were succeeded by
Southerners named Johnson.
 Andrew Johnson, who
succeeded Lincoln, was born in
1808.
 Lyndon Johnson, who
succeeded Kennedy, was born
in 1908.
 John Wilkes Booth, who
assassinated Lincoln, was born
in 1839.
 Lee Harvey Oswald, who
assassinated Kennedy, was born
in 1939.

Lincoln & Kennedy
Both assassins were known by
their three names.
 Both names are composed of
fifteen letters.
 Lincoln was shot at the theatre
named 'Ford.'
 Kennedy was shot in a car
called 'Lincoln.'
 Booth ran from the theatre and
was caught in a warehouse.
 Oswald ran from a warehouse
and was caught in a theatre.
 Booth and Oswald were
assassinated before their trials.

Lincoln & Kennedy
And here's the clincher.
 A week before Lincoln was
shot, he was in Monroe,
Maryland.
 A week before Kennedy was
shot, he was in Marilyn
Monroe.
 Oh…and on the day he died
Lincoln pardoned a man
named…
 Patrick Murphy

Election:
Which parties have
most power?
Party A - 45%
 Party B - 44%
 Party C - 7%
 Party D - 4%

We’re ready to play
some games…
An Example
Experiment: Roll Two Dice
 Possible Outcomes: Any
number from 1 to 6 can appear
on each die.
 There are 36 possible outcomes
 Each Outcome in the Sample
Space is equally probable.
 So the probability of each
outcome is 1/36
 What is the probability of the
Event - “get combined total of 7
on the dice”

(1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
(2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
(3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
(4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
(5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
(6,1) (6,2) (6,3) (6,4) (6,5) (6,6)
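Outcomes with a combined total of 7:
(1,6) (2,5) (3,4) (4,3) (5,2) (6,1)
6 of the 36 equally probable outcomes give a total of 7, so the probability is 6/36 = 1/6.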
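The two-dice example can be reproduced by listing the whole sample space and counting the outcomes in the event. This is a minimal sketch; the variable names are chosen for illustration only.

```python
from fractions import Fraction
from itertools import product

# All ordered pairs (first die, second die): the 36 equally probable outcomes
sample_space = list(product(range(1, 7), repeat=2))

# The event "get combined total of 7 on the dice"
event = [outcome for outcome in sample_space if sum(outcome) == 7]

# With equally probable outcomes, P(event) = outcomes in event / outcomes in sample space
probability = Fraction(len(event), len(sample_space))

print(len(sample_space), event, probability)
```

Changing the `7` in the event definition gives the probability of any other total in the same way.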
A more interesting example
Game Show
 “Who wants to win a Ferrari?”
 3 doors
 1 Car & 2 Goats
 You pick a door - e.g. #1
 Host knows what’s behind all
the doors and he opens another
door, say #3, and shows you a
goat
 He then asks if you want to
stick with your original choice
#1, or change to door #2?

Ask Marilyn.
Marilyn vos Savant
 Guinness Book of Records Highest IQ
 “Yes you should switch. The
first door has a 1/3 chance of
winning while the second has a
2/3 chance of winning.”
 Ph.D.s - Now two doors, 1 goat
& 1 car so chances of winning
are 1/2 for door #1 and 1/2 for
door #2.
 “You are the goat” - Western
State University.

Who’s right?

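At the start, the sample space is:
{CGG, GCG, GGC}
Pick a door e.g. #1
1 in 3 chance of winning
The host knows where the car is, so he always shows you a goat
behind one of the other two doors.
Stick with #1: you win only if the arrangement is CGG (1 in 3).
Switch: you win if the arrangement is GCG or GGC (2 in 3).
So Marilyn was right, you
should switch.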
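The game-show argument can be checked by enumerating every case rather than trusting intuition. The sketch below assumes the host always opens a goat door other than your pick, and that he chooses at random when both remaining doors hide goats. The function names are hypothetical.

```python
from fractions import Fraction

DOORS = (1, 2, 3)

def host_opens(car, pick):
    """Doors the host may open: never the player's pick, never the car."""
    return [d for d in DOORS if d != pick and d != car]

def win_probabilities(pick=1):
    """Exact chances of winning the car by sticking and by switching."""
    stick = Fraction(0)
    switch = Fraction(0)
    for car in DOORS:  # the car is behind each door with probability 1/3
        options = host_opens(car, pick)
        for opened in options:  # assumption: host picks uniformly among allowed doors
            weight = Fraction(1, 3) / len(options)
            switched_to = next(d for d in DOORS if d not in (pick, opened))
            if pick == car:
                stick += weight
            if switched_to == car:
                switch += weight
    return stick, switch

stick, switch = win_probabilities()
print(stick, switch)  # 1/3 2/3
```

The two doors left at the end are not equally likely to hide the car, because the host's choice depends on where the car is.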
Chapter 1
The Beginning
Statistics is the science of data.
This involves collecting, analysing
and interpreting information.
Descriptive Statistics uses
graphical and numerical
techniques to summarise and
display the information contained
in a dataset.
Inferential Statistics uses sample
data to make decisions or
predictions about a larger
population of data
More Definitions
Population: The entire collection of
individuals or objects about which
information is desired.
Sample: A part (subset) of the
population selected in some prescribed
manner.
Variable: A characteristic or property
of an individual unit in the population.
Representative Sample: A selection of
data chosen from the target population
which exhibits characteristics typical of
the population.
Representative samples should give
unbiased estimates
More Definitions
The most common way to select a
Representative Sample is to choose a
Random Sample.
A Random Sample is a sample
selected so that each different possible
sample of the desired size has an equal
chance of being the one chosen.
This implies that each member of the
original population has an equal chance
of being selected in any random
sample.
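The definition of a Random Sample can be illustrated on a tiny population. The sketch below uses a made-up population of five units and a sample size of 2; it lists every possible sample, checks how often each member appears, and then draws one sample with Python's standard `random` module. The seed is arbitrary.

```python
import random
from fractions import Fraction
from itertools import combinations

population = ["A", "B", "C", "D", "E"]  # hypothetical population of 5 units
sample_size = 2

# Every different possible sample of the desired size
all_samples = list(combinations(population, sample_size))

# Under random sampling each of these samples is equally likely to be chosen
chance_of_each_sample = Fraction(1, len(all_samples))

# Chance that a given member is included: the share of samples that contain it
inclusion = {
    member: Fraction(sum(member in s for s in all_samples), len(all_samples))
    for member in population
}

# Drawing one random sample (without replacement)
rng = random.Random(0)  # arbitrary seed so the draw can be repeated
drawn = rng.sample(population, sample_size)

print(len(all_samples), chance_of_each_sample, inclusion, drawn)
```

Every member turns up in the same share of the possible samples, which is the equal chance of selection described above.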
Descriptive vs
Inferential Statistics
Descriptive statistics is only interested
in describing a dataset, whereas
Inferential Statistics seeks to make a
decision based on the data.
An Example of
Descriptive Statistics
- UCD Faculties
Faculties
Faculty        # Students   # Degrees   # PG Degrees
Arts           4,438        1,153       342
Commerce       2,129        424         395
Law            463          120         43
Science        1,868        327         106
Engineering    1,142        229         88
Medicine       1,185        218         63
Architecture   289          79          22
Agriculture    950          130         0
[Bar chart: # Students by Faculty (axis 0 to 5000)]
[Bar chart: # Degrees by Faculty (axis 0 to 1500)]
[Bar chart: Degrees/Student by Faculty (axis 0 to 0.3)]
By using Descriptive Statistics to display the
data in this manner we can now analyse the data
more easily to find trends or patterns which were
not immediately obvious in the original dataset.
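The Degrees/Student figures can be recomputed from the # Students and # Degrees columns of the UCD Faculties data. This is a minimal sketch; the dictionary layout and variable names are chosen for illustration.

```python
# (students, degrees) per faculty, taken from the UCD Faculties table
faculties = {
    "Arts": (4438, 1153),
    "Commerce": (2129, 424),
    "Law": (463, 120),
    "Science": (1868, 327),
    "Engineering": (1142, 229),
    "Medicine": (1185, 218),
    "Architecture": (289, 79),
    "Agriculture": (950, 130),
}

# Descriptive summary: degrees awarded per enrolled student
ratio = {name: degrees / students for name, (students, degrees) in faculties.items()}

# Print faculties from highest to lowest ratio
for name, value in sorted(ratio.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<13}{value:.3f}")
```

Sorting by the ratio shows an ordering that the raw counts hide: the smallest faculty by student numbers has the highest ratio.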
The Basics of Inferential
Statistics - An Example
A Newspaper wants to know whether
people are happy with the performance
of the Government. They hire a
company to conduct an opinion poll.
The pollsters select 1000 people and
ask them the question: “Are you happy
with the performance of the
Government?”
The Newspaper prints a headline like
the following:
“70% want the Government to go”
or
“Government achieves record
popularity among voters”
How can the newspaper publish things
like this?
They have only got the opinions of fewer
than 1000 people (remember the
“don’t knows”).
1000/2.3 Million ≈ 0.00043 or 0.043%
Before the end of this course we will
find out in great detail whether we
should believe these polls.





For the moment let’s examine the
procedure carried out in this example.
The newspaper is interested in a certain
population. What is this Population?
The newspaper wants to measure some
variable for each unit of the population.
What variable do they want to
measure?
The opinion pollsters decide to select a
sample from the population. What is
the sample?
And what is so special about the
sample chosen?
Is the result reliable?
How to collect data.


Before we can begin making inferences about
the data we need to collect the data itself.
Usually one gets data in one of 4 different
ways.
Data from a published source
The data has already been collected and the
results published; all we need to do is draw
conclusions from the data. This is where
politicians and economists get most of their
data. A boring way to get data!!!
Data from a designed experiment
Here you design and conduct an experiment to
measure some characteristic of a population.
You have strict control over how the
experiment is carried out. This is the way
scientists collect their data and it is the
method which should provide the most
accurate results.
How to collect data continued...


Data from a survey
Here you select a representative sample of
people from the population you are interested
in. You ask each person some questions and
record their answers. This method is used by
polling companies, government statisticians
etc. It has certain obvious drawbacks relating
to the truthfulness of responses.
Data collected observationally
Here one observes the sample in its normal
environment and records the variables of
interest. Used by biologists and psychologists.