Math 3033
A Modern Introduction to Probability
and Statistics
Understanding Why and How
Chapter 17: Basic Statistical Models
Slides by Dan Varano
Modified by Longin Jan Latecki
17.1 Random Samples and Statistical Models
Random Sample: A random sample is a
collection of random variables X1, X2,…, Xn
that have the same probability
distribution and are mutually independent.
If F is a distribution function of each random
variable Xi in a random sample, we speak of a
random sample from F. Similarly we speak of a
random sample from a density f, a random
sample from an N(µ, σ²) distribution, etc.
17.1 continued
Statistical Model for repeated measurements:
A dataset consisting of values x1, x2,…, xn of
repeated measurements of the same quantity is
modeled as the realization of a random sample
X1, X2,…, Xn. The model may include a partial
specification of the probability distribution of each Xi.
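The model above can be illustrated with a minimal sketch in which a dataset is one realization of a random sample. It assumes, purely for illustration, an N(µ, σ²) distribution with µ = 10 and σ = 2; the function name `draw_realization` and the seeds are hypothetical choices, not part of the slides.

```python
import random

def draw_realization(n, mu, sigma, seed):
    """Return one realization x1, ..., xn of a random sample X1, ..., Xn.

    Each value is drawn independently from the same N(mu, sigma^2)
    distribution (an assumed distribution, chosen only for illustration).
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    # random.gauss takes the standard deviation sigma, not the variance.
    return [rng.gauss(mu, sigma) for _ in range(n)]

# Two realizations of the same random sample: same model,
# different observed datasets.
dataset_a = draw_realization(n=5, mu=10.0, sigma=2.0, seed=1)
dataset_b = draw_realization(n=5, mu=10.0, sigma=2.0, seed=2)
print(dataset_a)
print(dataset_b)
```

The two printed lists differ even though they come from the same model, which is the sense in which a dataset is only one realization of X1, X2,…, Xn.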
17.2 Distribution features and sample statistics
Empirical Distribution Function
Law of Large Numbers
\[ F_n(a) = \frac{\#(X_i \in (-\infty, a])}{n} \]
For every ε > 0,
\[ \lim_{n \to \infty} P(|F_n(a) - F(a)| > \varepsilon) = 0 \]
This implies that for most realizations
Fn(a) ≈ F(a)
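A short sketch of the approximation Fn(a) ≈ F(a), assuming a random sample from the N(0, 1) distribution. The helper names `empirical_cdf` and `standard_normal_cdf`, the sample size, and the seed are illustrative assumptions.

```python
import math
import random

def empirical_cdf(sample, a):
    """F_n(a): the fraction of sample values in (-inf, a]."""
    return sum(1 for x in sample if x <= a) / len(sample)

def standard_normal_cdf(a):
    """F(a) for the N(0, 1) distribution, written with the error function."""
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))

rng = random.Random(1)  # fixed seed for reproducibility
sample = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

# Compare F_n(a) with F(a) at a few points a.
for a in (-1.0, 0.0, 1.0):
    print(a, empirical_cdf(sample, a), standard_normal_cdf(a))
```

With a larger sample the two columns should typically agree more closely, in line with the law of large numbers.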
17.2 cont.
The histogram and kernel density estimate
\[ \frac{\#(X_i \in (x-h, x+h])}{2hn} \approx f(x) \]
Height of histogram on (x-h, x+h] ≈ f(x)
fn,h(x) ≈ f(x)
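Both approximations can be sketched as follows, assuming a random sample from N(0, 1). The histogram height uses the count in (x−h, x+h] divided by 2hn. The slides do not say which kernel is used for fn,h, so the Gaussian kernel here is an assumption, as are the function names, the bandwidth h = 0.25, and the seed.

```python
import math
import random

def histogram_height(sample, x, h):
    """Count of sample values in (x - h, x + h], divided by 2*h*n."""
    count = sum(1 for xi in sample if x - h < xi <= x + h)
    return count / (2.0 * h * len(sample))

def kernel_density_estimate(sample, x, h):
    """f_{n,h}(x) = (1 / (n*h)) * sum of K((x - x_i) / h).

    K is taken to be the standard normal density (an assumed kernel).
    """
    def kernel(u):
        return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    return sum(kernel((x - xi) / h) for xi in sample) / (len(sample) * h)

def standard_normal_density(x):
    """f(x) for the N(0, 1) distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

rng = random.Random(2)  # fixed seed for reproducibility
sample = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

x, h = 0.0, 0.25
print(histogram_height(sample, x, h))
print(kernel_density_estimate(sample, x, h))
print(standard_normal_density(x))
```

The choice of h matters: a smaller h reduces the smoothing bias but makes both estimates noisier for a fixed n.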
17.2 cont.
The sample mean, sample median, and
empirical quantiles
X̄n ≈ µ
Med(x1, x2,…, xn) ≈ q0.5 = Finv(0.5)
qn(p) ≈ Finv(p) = qp
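A sketch of these three approximations, assuming a random sample from the Exp(1) distribution, for which µ = 1 and Finv(p) = −ln(1 − p). The empirical quantile below interpolates between order statistics at position p(n + 1); this is one common convention and is an assumption here, since the slides do not spell out the definition of qn(p).

```python
import math
import random
import statistics

def empirical_quantile(sample, p):
    """q_n(p) by linear interpolation between order statistics.

    Uses position p*(n+1) in the sorted sample (an assumed convention);
    positions outside 1..n are clamped to the smallest or largest value.
    """
    xs = sorted(sample)
    n = len(xs)
    pos = p * (n + 1)
    k = math.floor(pos)
    alpha = pos - k
    if k < 1:
        return xs[0]
    if k >= n:
        return xs[-1]
    return xs[k - 1] + alpha * (xs[k] - xs[k - 1])

def exp1_quantile(p):
    """Finv(p) for the Exp(1) distribution."""
    return -math.log(1.0 - p)

rng = random.Random(3)  # fixed seed for reproducibility
sample = [rng.expovariate(1.0) for _ in range(10_000)]

print("mean  ", statistics.mean(sample), "vs mu =", 1.0)
print("median", statistics.median(sample), "vs", exp1_quantile(0.5))
print("q(0.9)", empirical_quantile(sample, 0.9), "vs", exp1_quantile(0.9))
```

The exponential distribution is used because its mean and median differ, so the two statistics can be seen tracking different features of F.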
17.2 cont.
The sample variance and standard
deviation, and the MAD
Sn² ≈ σ² and Sn ≈ σ
MAD(X1, X2,…, Xn) ≈ Finv(0.75) − Finv(0.5) (for symmetric F)
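A sketch of these approximations, assuming a random sample from N(0, σ²) with σ = 2, a symmetric distribution. The helper name `mad`, the sample size, and the seed are illustrative assumptions; `statistics.variance` uses the divisor n − 1.

```python
import random
import statistics

def mad(sample):
    """Median of absolute deviations from the sample median."""
    med = statistics.median(sample)
    return statistics.median([abs(x - med) for x in sample])

sigma = 2.0
rng = random.Random(4)  # fixed seed for reproducibility
sample = [rng.gauss(0.0, sigma) for _ in range(10_000)]

# Distribution-side target for the MAD: Finv(0.75) - Finv(0.5).
model = statistics.NormalDist(mu=0.0, sigma=sigma)
mad_target = model.inv_cdf(0.75) - model.inv_cdf(0.5)

print("variance", statistics.variance(sample), "vs", sigma ** 2)
print("std dev ", statistics.stdev(sample), "vs", sigma)
print("MAD     ", mad(sample), "vs", mad_target)
```

Note that the MAD does not estimate σ itself: for a normal distribution its target is roughly 0.67σ.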
17.2 cont.
Relative Frequencies
For a random sample X1, X2,…, Xn from a
discrete distribution with probability mass
function p, one has that
\[ \frac{\#(X_i = a)}{n} \approx p(a) \]
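A sketch of the relative-frequency approximation, assuming rolls of a fair die, so that p(a) = 1/6 for a = 1,…, 6. The function name `relative_frequencies`, the sample size, and the seed are illustrative assumptions.

```python
import random
from collections import Counter

def relative_frequencies(sample):
    """Map each observed value a to #(x_i = a) / n."""
    n = len(sample)
    counts = Counter(sample)
    return {a: counts[a] / n for a in counts}

rng = random.Random(5)  # fixed seed for reproducibility
sample = [rng.randint(1, 6) for _ in range(60_000)]
freqs = relative_frequencies(sample)

# Each relative frequency should be close to p(a) = 1/6.
for a in sorted(freqs):
    print(a, round(freqs[a], 4), "vs p(a) =", round(1 / 6, 4))
```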
17.4 The linear regression model
Simple Linear Regression Model: In a simple
linear regression model for a bivariate dataset
(x1, y1), (x2, y2),…,(xn, yn), we assume that x1,
x2,…, xn are nonrandom and that y1, y2,…, yn
are realizations of random variables Y1, Y2,…,
Yn satisfying
Yi = α + βxi + Ui for i = 1, 2,…, n,
where U1,…, Un are independent random
variables with E[Ui] = 0 and Var(Ui) = σ².
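A minimal sketch of generating one bivariate dataset from this model. The model only requires E[Ui] = 0 and Var(Ui) = σ²; drawing the Ui from a normal distribution is an extra assumption made here for convenience, as are the parameter values, the function name `simulate_responses`, and the seed.

```python
import random

def simulate_responses(xs, alpha, beta, sigma, seed):
    """Return y1, ..., yn with y_i = alpha + beta * x_i + u_i.

    The x_i are fixed (nonrandom) inputs; the u_i are independent
    N(0, sigma^2) draws (normality is an assumption of this sketch).
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    return [alpha + beta * x + rng.gauss(0.0, sigma) for x in xs]

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = simulate_responses(xs, alpha=1.0, beta=2.0, sigma=0.5, seed=6)

for x, y in zip(xs, ys):
    print(x, round(y, 3))
```

Rerunning with a different seed keeps the xi fixed and changes only the yi, which mirrors the assumption that the xi are nonrandom.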
17.4 cont.
Y1, Y2,…, Yn do not form a random sample.
The Yi have different distributions because,
whenever β ≠ 0 and the xi differ, the Yi have
different expectations:
E[Yi] = E[α + βxi + Ui] = α + βxi + E[Ui] = α + βxi
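The identity E[Yi] = α + βxi can be checked numerically with a small Monte Carlo sketch: for each fixed xi, average many independent realizations of Yi. The normal distribution for Ui, the parameter values, the function name `average_response`, and the seed are all assumptions made for illustration.

```python
import random

def average_response(x, alpha, beta, sigma, replications, rng):
    """Average of many independent realizations of Y = alpha + beta*x + U."""
    total = 0.0
    for _ in range(replications):
        total += alpha + beta * x + rng.gauss(0.0, sigma)
    return total / replications

alpha, beta, sigma = 1.0, 2.0, 1.0
rng = random.Random(7)  # fixed seed for reproducibility
xs = [0.0, 1.0, 2.0]
averages = [average_response(x, alpha, beta, sigma, 20_000, rng) for x in xs]

# Each average should be close to alpha + beta * x, a different
# value for each x, so the Y_i are not identically distributed.
for x, avg in zip(xs, averages):
    print(x, round(avg, 3), "vs", alpha + beta * x)
```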