E 243 Spring 2015 Lecture 5

Lecture 6
In the last lecture we went over discrete random variables. Now we will extend those
concepts to continuous random variables.
Continuous random variables
The probability distribution of a continuous random variable is described in terms of a
probability density function, f(x). Note that f(x) is not itself a probability. It
gives the probability per unit value of x, which is why it is called a probability
density function.
So the probability that X lies between a and b is given by
$$P(a < X < b) = \int_a^b f(x)\,dx$$
Also, it must be noted that
1) $f(x) \ge 0$
2) $\int_{-\infty}^{\infty} f(x)\,dx = 1$
Since P(X = x) = 0, the following holds true:
$$P(x_1 \le X \le x_2) = P(x_1 < X \le x_2) = P(x_1 \le X < x_2) = P(x_1 < X < x_2)$$
Example
Determine the value of k in the following probability density function, and find the
probability that X lies between 0 and 1.
$$f(x) = \begin{cases} k(1 + 2x), & 0 < x < 2 \\ 0, & \text{otherwise} \end{cases}$$
Solution:
$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$
$$\int_0^2 k(1 + 2x)\,dx = \left(kx + kx^2\right)\Big|_{x=0}^{x=2} = 2k + 4k = 1$$
k = 1/6
Now,
$$P(0 < X < 1) = k\left(x + x^2\right)\Big|_{x=0}^{x=1} = \frac{2}{6} \approx 0.33$$
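The normalization step and the probability in the example above can be checked numerically. The following is a minimal sketch, not part of the original lecture: the helper `simpson` and the names `shape`, `f`, `k` and `p_0_1` are chosen here for illustration, and Simpson's rule is just one convenient way to approximate the integrals.

```python
def simpson(g, a, b, n=1000):
    """Composite Simpson's rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        # Interior points alternate between weight 4 (odd i) and weight 2 (even i).
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


def shape(x):
    """Unnormalized form of the density on 0 < x < 2, i.e. f(x) / k."""
    return 1 + 2 * x


# k must make the total area under f(x) equal to 1.
k = 1 / simpson(shape, 0, 2)


def f(x):
    """Example density. The endpoints are included only for numerical convenience;
    single points carry zero probability, so this does not change any result."""
    return k * (1 + 2 * x) if 0 <= x <= 2 else 0.0


p_0_1 = simpson(f, 0, 1)
print("k =", k)
print("P(0 < X < 1) =", p_0_1)
```

The printed values should agree with k = 1/6 and P(0 < X < 1) = 2/6 from the hand calculation, up to floating-point rounding.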
Cumulative distribution function
Another way to define the probability distribution of a random variable is in terms of
the probability that X is less than or equal to x.
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(u)\,du$$
With this definition,
$$P(a < X < b) = F(b) - F(a)$$
Example (continued)
Determine F(x). Find probability that X lies between 0 and 1.
Solution:
For 0 < x < 2,
$$F(x) = \int_0^x \frac{1 + 2u}{6}\,du = \frac{x + x^2}{6}$$
with F(x) = 0 for x ≤ 0 and F(x) = 1 for x ≥ 2.
P(0 < X < 1) = F(1) - F(0) = 2/6 - 0/6 = 0.333
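The same probability can be read directly off the cumulative distribution function. This is a small sketch assuming the example density f(x) = (1 + 2x)/6 on 0 < x < 2; the function names `F` and `prob_between` are illustrative choices, not notation from the lecture.

```python
def F(x):
    """CDF of the example density f(x) = (1 + 2x)/6 on 0 < x < 2."""
    if x <= 0:
        return 0.0          # no probability mass below 0
    if x >= 2:
        return 1.0          # all probability mass lies below 2
    return (x + x**2) / 6


def prob_between(a, b):
    """P(a < X < b) = F(b) - F(a) for a continuous random variable."""
    return F(b) - F(a)


print(prob_between(0, 1))   # approximately 0.333
```

Writing the probability as a difference of two CDF values avoids repeating the integration for every new interval.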
Mean and Variance
The mean value of any random variable X is also called the expected value of X and is
denoted by E(X) or μ. The mean calculated from any random sample is denoted by x̄.
The mean value of any data set is given by summing up all the values and dividing by
the total number of values. If the same value x occurs n_x times, it contributes x·n_x
to the sum. With N the total number of values and p(x) = n_x/N the relative frequency
of x, this translates to
$$E(X) = \frac{\sum_{\text{all } x} x\,n_x}{N} = \sum_{\text{all } x} x\,p(x)$$
This can be translated into the form of the continuous variable as,
$$E(X) = \mu = \int_{-\infty}^{\infty} x f(x)\,dx$$
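Similarly, the variance of a random variable X is denoted by V(X) or σ². It is given by
the average of the squared deviations of each data point from the mean (a measure of
spread). The square root of the variance is the standard deviation (σ). The standard
deviation has the same dimensions as the mean. The standard deviation computed from
a random sample is denoted by s.
$$s^2 = \frac{\sum_{\text{all } x} n_x (x - \bar{x})^2}{N - 1} \approx \sum_{\text{all } x} p(x)(x - \bar{x})^2$$
Here n_x is the number of times the value x occurs, N is the total number of values and
p(x) = n_x/N. The last step is approximate and holds for large N, when N - 1 ≈ N.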
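The frequency-weighted formulas for the sample mean and sample variance can be sketched in a few lines. The data below are made up purely for illustration and do not come from the lecture; the cross-check uses the standard-library `statistics` module, whose `variance` function also divides by N - 1.

```python
import statistics
from collections import Counter

data = [2, 3, 3, 5, 5, 5, 7]      # hypothetical sample, for illustration only
counts = Counter(data)            # n_x: how many times each distinct value x occurs
N = len(data)                     # total number of values

# Sample mean: sum of x * n_x over distinct values, divided by N.
mean = sum(x * n for x, n in counts.items()) / N

# Sample variance: frequency-weighted squared deviations, divided by N - 1.
s2 = sum(n * (x - mean) ** 2 for x, n in counts.items()) / (N - 1)
s = s2 ** 0.5                     # sample standard deviation

print(mean, statistics.mean(data))
print(s2, statistics.variance(data))
```

Each pair of printed numbers should match up to floating-point rounding, since both routes compute the same quantity.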
This can be translated into the form of a continuous variable as,
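$$V(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$$
$$V(X) = \int_{-\infty}^{\infty} (x^2 + \mu^2 - 2x\mu) f(x)\,dx$$
$$V(X) = \int_{-\infty}^{\infty} x^2 f(x)\,dx + \int_{-\infty}^{\infty} \mu^2 f(x)\,dx - \int_{-\infty}^{\infty} 2x\mu f(x)\,dx$$
$$V(X) = E(X^2) + E(X)^2 - 2E(X)^2 = E(X^2) - E(X)^2$$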
Example (Continued)
Find the mean and variance of X.
Solution:
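$$E(X) = \mu = \int_{-\infty}^{\infty} x f(x)\,dx$$
$$E(X) = \mu = \int_0^2 \frac{x(1 + 2x)}{6}\,dx$$
$$E(X) = \left(\frac{x^2}{12} + \frac{x^3}{9}\right)\Bigg|_{x=0}^{x=2} = 1.222$$
$$V(X) = \sigma^2 = E(X^2) - \mu^2$$
$$E(X^2) = \int_0^2 \frac{x^2(1 + 2x)}{6}\,dx$$
$$E(X^2) = \left(\frac{x^3}{18} + \frac{x^4}{12}\right)\Bigg|_{x=0}^{x=2} = \frac{16}{9}$$
$$V(X) = \sigma^2 = \frac{16}{9} - \frac{121}{81} = \frac{23}{81} = 0.284$$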
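The mean and variance of the example can be verified numerically, both from the definition of the variance and from the shortcut E(X²) - E(X)². This is a minimal sketch assuming the density f(x) = (1 + 2x)/6 on 0 < x < 2; the `simpson` helper and the variable names are illustrative and not part of the lecture.

```python
def simpson(g, a, b, n=1000):
    """Composite Simpson's rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


def f(x):
    """Example density; endpoints included only for numerical convenience."""
    return (1 + 2 * x) / 6 if 0 <= x <= 2 else 0.0


mu = simpson(lambda x: x * f(x), 0, 2)                         # E(X)
ex2 = simpson(lambda x: x**2 * f(x), 0, 2)                     # E(X^2)
var_shortcut = ex2 - mu**2                                     # E(X^2) - E(X)^2
var_definition = simpson(lambda x: (x - mu)**2 * f(x), 0, 2)   # integral of (x - mu)^2 f(x)

print(mu, ex2, var_shortcut, var_definition)
```

Both variance values should agree with each other and with 23/81 from the hand calculation, up to floating-point rounding.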
Population Median, quartiles and percentiles
The pth percentile of a random variable X is the value xp such that p% of the values of X
are below it. That is,
$$\int_{-\infty}^{x_p} f(x)\,dx = \frac{p}{100}$$
The median value of X is xm such that 50% of the values of X are below it and 50% are
above it. That is,
$$\int_{-\infty}^{x_m} f(x)\,dx = 0.5$$
The first quartile of X is xq1 such that 25% of the values of X are below it. That is,
$$\int_{-\infty}^{x_{q1}} f(x)\,dx = 0.25$$
Similarly, the third quartile may be defined as the value xq3 such that 75% of the values
of X are below it. That is,
$$\int_{-\infty}^{x_{q3}} f(x)\,dx = 0.75$$
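As an illustration, the percentile definition can be applied to the example density used earlier in this lecture, f(x) = (1 + 2x)/6 on 0 < x < 2, whose cumulative distribution function is F(x) = (x + x²)/6 on that interval. The sketch below solves F(x_p) = p/100 by bisection; the function names `percentile` and `percentile_closed_form` are chosen here for illustration, and bisection is just one simple root-finding option.

```python
import math


def F(x):
    """CDF of the example density f(x) = (1 + 2x)/6 on 0 < x < 2."""
    if x <= 0:
        return 0.0
    if x >= 2:
        return 1.0
    return (x + x**2) / 6


def percentile(p, lo=0.0, hi=2.0, tol=1e-10):
    """Solve F(x_p) = p/100 by bisection, assuming 0 < p < 100 and F increasing on [lo, hi]."""
    target = p / 100
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if F(mid) < target:
            lo = mid        # x_p lies to the right of mid
        else:
            hi = mid        # x_p lies at or to the left of mid
    return (lo + hi) / 2


def percentile_closed_form(p):
    """Positive root of x^2 + x - 6(p/100) = 0, valid only for this example."""
    return (-1 + math.sqrt(1 + 24 * p / 100)) / 2


q1, median, q3 = percentile(25), percentile(50), percentile(75)
print(q1, median, q3)
print(percentile_closed_form(25), percentile_closed_form(50), percentile_closed_form(75))
```

The bisection route works for any continuous, increasing F, while the closed form is available here only because F happens to be a quadratic.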