LAB 2 – Random Variables, Sampling Distributions of Counts, and
Normal Distributions
ECA 225 has open lab hours if you need to finish LAB 2. The lab is open
Monday-Thursday 6:30-10:00pm and Saturday-Sunday 2:00-6:00pm.
To download R onto your own personal computer, go to:
http://cran.r-project.org/bin/windows/base/
Click on the link for R-2.6.1-win32.exe and save the file to your computer. Then
open the file to start the installation.
Your submission for LAB 2 should consist of answers to the numbered questions as you
work through the Lab.
***AS YOU ARE WORKING THROUGH THE LAB, copy and paste each output into a
blank Word file****
You can either print the completed Word file and turn that in, or you can e-mail the
Word file to me for your LAB 2 grade.
Everything MUST be done in R and included in your Word file.
***********************************************************************
***********************************************************************
Access R
On the desktop or through the Programs Menu, find the R icon and click on it.
You should be brought to a screen with a command prompt (>).
Random Variables
You can use R to simulate certain Random Variables. For example, if X is equal
to the outcome of tossing a coin, you could use 0 to represent a head and 1 to represent a
tail. R can simulate tossing a coin ten times with the following command:
>sample (0:1, size = 10, replace = TRUE)
output: [1] 0 0 1 1 1 1 1 0 1 0
To read the above results: in the ten tosses of the coin, we had 4 heads and 6 tails.
The 0:1 code tells R to choose an integer from 0 to 1 inclusive, the size = 10 code
tells R how many values to draw, and the replace = TRUE code tells R that
sampling is done with replacement.
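Because sample() draws new random values on every run, your output will usually
not match the output printed in this handout. The following is a minimal sketch,
assuming you want repeatable results and a tally of the outcomes: set.seed() fixes
the random number generator and table() counts how often each value occurs. The
seed value 225 and the names tosses, counts, n_heads and n_tails are arbitrary
choices for illustration and are not part of the lab.

```r
# Fix the random number generator so the same tosses appear on every run.
# The seed 225 is an arbitrary choice for illustration.
set.seed(225)

# Toss a coin ten times: 0 represents a head, 1 represents a tail.
tosses <- sample(0:1, size = 10, replace = TRUE)
tosses

# Tally the outcomes. factor() keeps both 0 and 1 in the table
# even if one of them never occurs.
counts <- table(factor(tosses, levels = 0:1))
counts

# Number of heads (zeros) and tails (ones).
n_heads <- sum(tosses == 0)
n_tails <- sum(tosses == 1)
```

The two counts always add up to the value given for size.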
For another example, the following simulates 20 tosses of a six-sided die, where
X = the outcome of a single toss.
>sample (1:6, size = 20, replace = TRUE)
output: [1] 1 2 1 5 4 6 4 1 4 3 1 5 1 2 5 6 3 3 1 2
For another example, the following simulates 13 tosses of two six-sided dice, where
X = the sum of the two dice on a single toss.
>sample (1:6, size = 13, replace = TRUE) + sample (1:6, size = 13, replace = TRUE)
output: [1] 8 7 5 8 3 6 4 5 7 3 4 11 8
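With only 13 tosses it is hard to see which sums are more likely. As a rough
sketch, assuming a much larger number of tosses (10000 here, an arbitrary choice),
the simulated relative frequencies can be compared with the exact probabilities
obtained by counting the 36 equally likely pairs of faces. The seed and the
variable names are illustrative only.

```r
set.seed(225)      # arbitrary seed, for repeatable output
n_rolls <- 10000   # arbitrary number of simulated tosses

# Simulate the two dice separately and add them toss by toss.
die1 <- sample(1:6, size = n_rolls, replace = TRUE)
die2 <- sample(1:6, size = n_rolls, replace = TRUE)
sums <- die1 + die2

# Relative frequency of each possible sum, 2 through 12.
simulated <- as.vector(table(factor(sums, levels = 2:12))) / n_rolls

# Exact probabilities: outer() builds all 36 pairs of faces,
# and table() counts how many pairs give each sum.
exact <- as.vector(table(outer(1:6, 1:6, "+"))) / 36

# Put the two rows side by side, labelled by the sum.
comparison <- rbind(simulated, exact)
colnames(comparison) <- 2:12
round(comparison, 3)
```

The two rows should be close to each other, with a sum of 7 the most likely at
6/36.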
1. What would be the code to simulate Y = the sum of tossing a six-sided die and a
four-sided die, for 15 tosses?
2. What is your output for (1)? Copy and paste your R code and output.
3. Simulate a lottery ticket that consists of choosing 5 numbers from the numbers 01 to
45. Include your R code and output. (This type of sampling is done without
replacement.)
Binomial Variables and Sampling distribution of a Count
We can use R to calculate probabilities and simulate samples from a binomial
distribution. For example, suppose the probability that a person aged 20 will be alive at
age 65 is 0.80. If we select 15 people aged 20 and let X = the number that are alive at
age 65, then X has a Binomial distribution with n = 15 and p = 0.8. The probability
that X = 7 can be found with R:
>dbinom(7, size=15, prob = 0.8)
output: [1] 0.003454764
The probability that X is at least 7 can be found with R:
>sum(dbinom(7:15,size =15, prob = 0.8))
output: [1] 0.999215
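The same "at least 7" probability can also be obtained from the cumulative
distribution function pbinom(), which returns P(X <= q). The sketch below shows two
equivalent forms next to the summation used above. The variable names are
illustrative only.

```r
# P(X >= 7) = 1 - P(X <= 6), using the binomial cumulative distribution.
p_complement <- 1 - pbinom(6, size = 15, prob = 0.8)

# lower.tail = FALSE returns P(X > 6), which equals P(X >= 7) for a count.
p_upper <- pbinom(6, size = 15, prob = 0.8, lower.tail = FALSE)

# The summation of individual probabilities, for comparison.
p_sum <- sum(dbinom(7:15, size = 15, prob = 0.8))

# All three values agree (approximately 0.999215).
c(p_complement, p_upper, p_sum)
```

Note that the cutoff passed to pbinom() is 6, not 7, because "at least 7" excludes
every count up to and including 6.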
You can also draw a plot to represent the sampling distribution of this count:
>heights = dbinom(0:15, size=15, prob=0.8)
>plot(0:15, heights, type = "h", main = "Spike plot of X", xlab = "x", ylab = "prob")
output: [spike plot of X, not shown]
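Simulating samples from a binomial distribution can be done with rbinom(), the
base R random generator for binomial counts. The following is a sketch only,
assuming 1000 simulated groups of 15 people. The number of groups, the seed and
the variable names are arbitrary choices for illustration.

```r
set.seed(225)      # arbitrary seed, for repeatable output
n_groups <- 1000   # arbitrary number of simulated groups of 15 people

# Each value is the number alive at 65 in one simulated group of 15.
alive <- rbinom(n_groups, size = 15, prob = 0.8)

# The average count should be close to n * p = 15 * 0.8 = 12.
mean(alive)

# Simulated estimate of P(X = 7); compare with dbinom(7, size = 15, prob = 0.8).
mean(alive == 7)

# Spike plot of the simulated relative frequencies for x = 0, ..., 15.
rel_freq <- table(factor(alive, levels = 0:15)) / n_groups
plot(0:15, as.vector(rel_freq), type = "h",
     main = "Simulated counts", xlab = "x", ylab = "relative frequency")
```

With enough simulated groups, this plot should look similar to the spike plot of
the exact probabilities.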
4. Pinworm infestation can be treated with a drug. According to a study, the drug
cures pinworm in 90% of cases. Suppose that 8 children with pinworm are given the
drug. What is the probability that 6 children are cured? Show R code and output.
5. What is the probability that at least one child is not cured? Show R code and output.
6. Construct a graph to show the sampling distribution of X = the number of children in
the sample of size 8 that are cured.
7. How many children should be sampled so that the distribution of X is approximately
normal? Explain your reasoning and show a graph of the sampling distribution of X
that is approximately normal.
8. A fair coin is tossed 100,000 times. The number of heads is recorded. What is the
probability that there are between 49,800 and 50,200 heads, inclusive? Show R
code and output.
Normal Distributions
As with binomial random variables, we can simulate normal random variables
and calculate probabilities for them. For example, if you want to draw a normal
distribution in R:
> curve(dnorm(x, mean=4, sd=0.5),2,6)
output: [normal curve with mean 4 and standard deviation 0.5, not shown]
Note: the “2,6” part of the code just tells R the range to use for the
horizontal axis. You can use any reasonable range, such as four standard
deviations on either side of the mean.
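Simulating normal random variables can be done with rnorm(), the base R random
generator for normal values. The following is a sketch only, assuming 1000
simulated values from the same distribution (mean 4, standard deviation 0.5). The
sample size, the seed and the name x_sim are arbitrary choices for illustration.

```r
set.seed(225)   # arbitrary seed, for repeatable output

# Draw 1000 values from a normal distribution with mean 4 and sd 0.5.
x_sim <- rnorm(1000, mean = 4, sd = 0.5)

# The sample mean and standard deviation should be close to 4 and 0.5.
mean(x_sim)
sd(x_sim)

# Histogram on the density scale, with the theoretical curve drawn on top.
hist(x_sim, freq = FALSE, main = "Simulated normal values", xlab = "x")
curve(dnorm(x, mean = 4, sd = 0.5), add = TRUE)
```

The freq = FALSE argument puts the histogram on the same vertical scale as the
density curve, so the two can be compared directly.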
If you want to calculate the probability that X is less than 3 in a normal
distribution with mean 4 and standard deviation 0.5, use the following code:
>pnorm(3,mean=4,sd=0.5)
output: [1] 0.02275013
The function “pnorm” returns the area to the LEFT of the value you give it.
If you want to calculate the probability that X is greater than 3 in a normal
distribution with mean 4 and standard deviation 0.5, use the following code:
>1-pnorm(3,mean=4,sd=0.5)
output: [1] 0.9772499
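As an alternative to subtracting from one, pnorm() accepts a lower.tail argument
that asks for the area to the right directly. A brief sketch, with illustrative
variable names:

```r
# Area to the RIGHT of 3, requested directly with lower.tail = FALSE.
p_right <- pnorm(3, mean = 4, sd = 0.5, lower.tail = FALSE)

# The subtraction form, for comparison.
p_complement <- 1 - pnorm(3, mean = 4, sd = 0.5)

# Both values agree (approximately 0.9772499).
c(p_right, p_complement)
```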
If you want to calculate the probability that X is between 3 and 5 in a normal
distribution with mean 4 and standard deviation 0.5, you can use any of the
following:
>1-2*pnorm(3,mean=4,sd=0.5)
output: [1] 0.9544997
Notice that this finds the area to the left of 3, multiplies it by 2 because the
curve is symmetric about the mean, and then subtracts those two tails from one.
This shortcut only works because 3 and 5 are the same distance from the mean of 4.
OR:
>pnorm(5,mean=4,sd=0.5)-pnorm(3,mean=4,sd=0.5)
output: [1] 0.9544997
OR:
>diff(pnorm(c(3,5),mean=4,sd=0.5))
output: [1] 0.9544997
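The same probability can also be found by converting the endpoints to z-scores,
z = (x - mean) / sd, and using the standard normal distribution, which is what
pnorm() assumes when no mean and sd are given. A short sketch, with illustrative
variable names:

```r
mu <- 4        # mean of the distribution
sigma <- 0.5   # standard deviation of the distribution

# Convert the endpoints 3 and 5 to z-scores.
z_low <- (3 - mu) / sigma    # -2
z_high <- (5 - mu) / sigma   #  2

# Area between the two z-scores under the standard normal curve.
p_between <- pnorm(z_high) - pnorm(z_low)
p_between                    # approximately 0.9544997
```

Since the endpoints are two standard deviations on either side of the mean, the
result is the familiar "about 95%" of the 68-95-99.7 rule.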
9. Tarantula carapace lengths are normally distributed with mean 18.14 mm and
standard deviation 1.76 mm. Draw a normal distribution curve in R for the carapace
lengths.
10. What is the probability that a randomly selected tarantula has a carapace length less
than 15 mm? Include R code and output.
11. What is the probability that a randomly selected tarantula has a carapace length
greater than 20 mm? Include R code and output.
12. What is the probability that a randomly selected tarantula has a carapace length
within 1 mm of the population mean? Include R code and output.