Download lec1 - Department of Statistical Science

STAT 113
Probability and statistics
Instructor: Sayan Mukherjee
TA: Quanlin Li
Sta 114, spring 2008
Perspectives on stats
There are three kinds of lies: lies, damned lies, and statistics.
B. Disraeli
What is probability?
Probability is a branch of mathematics that deals with
calculating the likelihood of a given event's occurrence,
which is expressed as a number between 0 and 1.
What is statistics?
Statistics derives from:
Latin -- statisticum collegium ("council of state")
Italian -- statista ("statesman" or "politician")
German -- Statistik, first introduced by Gottfried Achenwall (1749), originally
designated the analysis of data about the state, or the
"science of state". It acquired the meaning of the collection
and classification of data generally in the early 19th century.
Statistics as inverse probability -- estimating parameters from experimental
data
Well-posed problems
Inverse problems are typically ill-posed
A problem is well-posed if its solution
• exists
• is unique
• is stable, i.e. depends continuously on the data
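The stability condition can be illustrated with a small linear inverse problem. The following is a minimal sketch, assuming a hypothetical nearly singular 2x2 system; the matrix, the data values, and the helper `solve_2x2` are chosen for illustration and do not come from the lecture. A solution exists and is unique, but a tiny change in the data moves it a long way, so the problem is ill-posed in the stability sense.

```python
def solve_2x2(a, b, c, d, y1, y2):
    """Solve [[a, b], [c, d]] * [x1, x2] = [y1, y2] by Cramer's rule."""
    det = a * d - b * c
    return ((y1 * d - b * y2) / det, (a * y2 - c * y1) / det)


# Hypothetical forward model: the two rows are almost parallel,
# so the determinant is tiny (about 1e-4).
A = (1.0, 1.0, 1.0, 1.0001)

# Two data vectors that differ by only 1e-4 in the second entry.
x_clean = solve_2x2(*A, 2.0, 2.0001)
x_noisy = solve_2x2(*A, 2.0, 2.0002)

data_change = 1e-4
solution_change = max(abs(p - q) for p, q in zip(x_clean, x_noisy))

print("solution from clean data:    ", x_clean)
print("solution from perturbed data:", x_noisy)
print("amplification factor:        ", solution_change / data_change)
```

The amplification factor is a rough measure of how far the problem is from depending continuously on the data in any practically useful sense.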
Class requirements and rules
Course webpage
First digits
List of world records
Count entries starting with:
{1,2,3,4,5,6,7,8,9}
Count entries ending with:
{1,2,3,4,5,6,7,8,9}
Accounting fraud
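The first-digit versus last-digit count on this slide appears to allude to Benford's law, under which the leading digit d occurs with frequency log10(1 + 1/d) in data spanning several orders of magnitude. The slide does not name the law, so that reading is an assumption. The sketch below uses synthetic log-uniform values rather than the list of world records, and the names `first_digit` and `benford_expected` are illustrative.

```python
import math
import random
from collections import Counter


def first_digit(n):
    """Leading decimal digit of a positive integer."""
    return int(str(n)[0])


def benford_expected(d):
    """Frequency of leading digit d predicted by Benford's law."""
    return math.log10(1.0 + 1.0 / d)


# Synthetic data (an assumption, not the lecture's data set):
# integers spread log-uniformly between 10^3 and 10^9.
rng = random.Random(0)
values = [int(10 ** rng.uniform(3, 9)) for _ in range(20000)]

first_counts = Counter(first_digit(v) for v in values)
last_counts = Counter(v % 10 for v in values)

first_freq = {d: first_counts[d] / len(values) for d in range(1, 10)}
last_freq = {d: last_counts[d] / len(values) for d in range(0, 10)}

for d in range(1, 10):
    print(d, round(first_freq[d], 3), round(benford_expected(d), 3),
          round(last_freq[d], 3))
```

In this synthetic data the first digits are strongly skewed toward 1 while the last digits are close to uniform. Fabricated figures often fail to show this skew, which is one reason first-digit counts are used to screen accounting records.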
What’s wrong with the heartland?
It’s the emptiness
The geometry of randomness
Dido’s problem (Isoperimetry) : Among all closed level curves
of fixed length, find the one that encloses the largest
area.
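The classical answer to Dido's problem is the circle, which is the content of the isoperimetric inequality 4*pi*A <= L^2 for a closed plane curve of length L enclosing area A. As a small numerical check, and only a sketch, the code below compares regular polygons of fixed perimeter with the circle of the same length. The function names are illustrative.

```python
import math


def regular_polygon_area(sides, perimeter):
    """Area of a regular polygon with the given number of sides and perimeter."""
    return perimeter ** 2 / (4.0 * sides * math.tan(math.pi / sides))


def circle_area(perimeter):
    """Area of the circle whose circumference equals the given perimeter."""
    return perimeter ** 2 / (4.0 * math.pi)


perimeter = 1.0
polygon_areas = [regular_polygon_area(k, perimeter) for k in (3, 4, 6, 12, 100)]
best_possible = circle_area(perimeter)

for k, area in zip((3, 4, 6, 12, 100), polygon_areas):
    print(f"{k:>3} sides: area = {area:.5f}")
print(f" circle : area = {best_possible:.5f}")
```

The polygon areas increase with the number of sides and approach the circle's area from below without exceeding it.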
The geometry of Gaussian random variables
A Gaussian distribution:
The geometry of Gaussian random variables
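A draw of n independent standard Gaussian random variables is a point in an n-dimensional space. How far from the origin is this point?
$\|x\| = \sqrt{x_1^2 + x_2^2 + \dots + x_n^2}$
For n large the answer is that, with very high probability and for some constant $c$,
$\sqrt{n}\left(1 - \frac{c}{\sqrt{n}}\right) \le \|x\| \le \sqrt{n}\left(1 + \frac{c}{\sqrt{n}}\right)$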
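The concentration of the norm near sqrt(n) is easy to check by simulation. The following is a minimal sketch, assuming independent standard Gaussian coordinates (mean 0, variance 1); the dimension, the number of draws, and the helper `gaussian_norm` are illustrative choices.

```python
import math
import random


def gaussian_norm(n, rng):
    """Euclidean norm of one draw of n independent standard Gaussians."""
    return math.sqrt(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n)))


rng = random.Random(0)
n = 1000
norms = [gaussian_norm(n, rng) for _ in range(200)]

mean_norm = sum(norms) / len(norms)
largest_gap = max(abs(r - math.sqrt(n)) for r in norms)

print("sqrt(n)                  =", round(math.sqrt(n), 3))
print("average norm             =", round(mean_norm, 3))
print("largest |norm - sqrt(n)| =", round(largest_gap, 3))
```

Even though each coordinate is unbounded, the draws lie in a thin shell around the sphere of radius sqrt(n): the gap between the norm and sqrt(n) stays of order one while sqrt(n) itself grows with the dimension.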
Law of large numbers or central limit theorem
The previous observation is a special case of the following
phenomenon:
Given a smooth function of n variables $x = (x_1, \dots, x_n)$, the following is true:
$\Pr\left( \left| f(x) - \mathbb{E}_x f(x) \right| \ge h \right) \le C_1 \exp\left(-C_2 h^2 n\right)$.
A classic example: $f(x) = \frac{x_1 + x_2 + \dots + x_n}{n}$.
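For the classic example of the sample mean, the tail bound can be compared against simulation. This is a sketch under stated assumptions: the x_i are taken to be independent standard Gaussians, for which the standard Gaussian tail bound gives C1 = 2 and C2 = 1/2. Those constants and the function names are assumptions for this illustration, not values given in the lecture.

```python
import math
import random


def tail_probability(n, h, trials, rng):
    """Monte Carlo estimate of Pr(|mean of n standard Gaussians| >= h)."""
    hits = 0
    for _ in range(trials):
        mean = sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n
        if abs(mean) >= h:
            hits += 1
    return hits / trials


def gaussian_bound(n, h):
    """Upper bound C1 * exp(-C2 * h^2 * n) with C1 = 2 and C2 = 1/2."""
    return 2.0 * math.exp(-0.5 * h * h * n)


rng = random.Random(0)
h = 0.3
results = [
    (n, tail_probability(n, h, 2000, rng), gaussian_bound(n, h))
    for n in (10, 50, 100)
]

for n, empirical, bound in results:
    print(f"n = {n:>3}: empirical = {empirical:.4f}, bound = {bound:.4f}")
```

Both the estimated probability and the bound shrink rapidly as n grows, which is the exponential decay in n that the inequality describes.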
Geometry of real data


Digits in space
Mandarin tones
Regression -- pedestrian detection
Papageorgiou and Poggio, 1998
Daimler Chrysler
Experimental Mercedes
A fast version, integrated
with a real-time obstacle
detection system
MPEG
Constantine Papageorgiou
People classification/detection
Stuttgart
More regression: talking faces
Text-to-visual-speech (TTVS) systems:
STA 293 03, fall 2005
More regression: talking faces
• Hunter
• It's automatic
• Today show
Conclusion
Statistics is about predictive modeling that quantifies
uncertainty
There are known knowns; there are things we know we
know. We also know there are known unknowns; that is
to say we know there are some things we do not know.
---- Donald Rumsfeld