Basic statistics
Usman Roshan

Basic probability and stats
• Random variable
• Probability of an event
• Coin toss example
• Independent random variables
• Mean and variance of a random variable
• Correlation between random variables
• Probability distributions
• Central limit theorem

Random variable
• A variable normally takes on different values
• A random variable takes on its values with different probabilities
• Coin toss example
• Dice example
• Probabilities must sum to 1

Probability of event
• Sample space: set of all possible outcomes
• Event space: set of outcomes of interest
• When all outcomes are equally likely, the probability of an event is
– (size of event space)/(size of sample space)
• Counting: how many ways to pick k unique items from a set of n items?
• Probability and counting
• Bernoulli trials
• Coin tossing example
• R function: rbinom

Basic stats
• Independent events: coin toss example
• Expected value of a random variable
– example of Bernoulli and Binomial
• Variance of a random variable
• Correlation coefficient (same as Pearson correlation coefficient)
• Formulas:
– $\mathrm{Covariance}(X,Y) = E((X-\mu_X)(Y-\mu_Y))$
– $\mathrm{Correlation}(X,Y) = \mathrm{Covariance}(X,Y)/(\sigma_X \sigma_Y)$
– Pearson correlation

Probability distributions
• Binomial distribution
– sum of Bernoulli trials
– converges to Gaussian (once centered and scaled) as number of trials approaches infinity
• R functions
– Reference card
– rbinom
– Repeated executions of rbinom
• Lists
• for loops
• sum, length
• Gaussian distribution
• Chi-square distribution

Limit theorems
• Law of large numbers: empirical mean converges to true mean as we do more trials (follows from Chebyshev’s and Markov’s inequalities)

Law of large numbers
• Markov’s inequality: $P(X \geq a) \leq \frac{E(X)}{a}$ for nonnegative X and a > 0
• Chebyshev’s inequality: $P(|X - E(X)| \geq a) \leq \frac{\mathrm{Var}(X)}{a^2}$
• Law of large numbers: sample mean of n i.i.d. random variables $X_i$ converges to the true mean in probability
• Can be proved by applying Chebyshev’s inequality to the sample mean
– $\bar{X} = \frac{\sum_i X_i}{n}$, $P(|\bar{X} - E(X)| \geq \epsilon) \leq \frac{\mathrm{Var}(\bar{X})}{\epsilon^2} = \frac{\sigma^2}{n\epsilon^2}$

Limit theorems
• Central limit theorem: the sampling distribution of the sample mean converges to a normal distribution as we do more trials. Specifically, for large n it is approximately normally distributed with mean equal to the true mean μ and standard deviation equal to σ/sqrt(n), where n is the number of trials and σ is the true standard deviation
• How is this useful? Consider modeling the mean height of NJ residents. Can we assume it is normally distributed due to the Central Limit Theorem?
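
The limit theorems above can be checked empirically. What follows is a minimal sketch in Python rather than R: the slides point to R's rbinom, and Python's standard-library random module stands in for it here. The function names bernoulli_sample_mean and simulate_sample_means, the seed, and the choices p = 0.5 and ε = 0.05 are illustrative assumptions, not part of the slides. The sketch repeats an experiment of n coin tosses many times, then compares the spread of the sample means with σ/sqrt(n) and the fraction of large deviations with the Chebyshev bound σ²/(nε²).

```python
import math
import random
import statistics


def bernoulli_sample_mean(n, p, rng):
    """Mean of n Bernoulli(p) trials, i.e. the fraction of heads in n tosses."""
    return sum(1 if rng.random() < p else 0 for _ in range(n)) / n


def simulate_sample_means(n, p, repeats, seed=0):
    """Repeat the n-toss experiment `repeats` times and collect the sample means."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    return [bernoulli_sample_mean(n, p, rng) for _ in range(repeats)]


p = 0.5                          # fair coin (illustrative choice)
sigma = math.sqrt(p * (1 - p))   # true standard deviation of one toss
eps = 0.05                       # deviation threshold for the Chebyshev check

for n in (10, 100, 1000):
    means = simulate_sample_means(n, p, repeats=1000)
    empirical_sd = statistics.pstdev(means)   # spread of the sample means
    predicted_sd = sigma / math.sqrt(n)       # sigma / sqrt(n) from the CLT slide
    frac_far = sum(abs(m - p) >= eps for m in means) / len(means)
    chebyshev_bound = min(1.0, sigma ** 2 / (n * eps ** 2))
    print(f"n={n}: mean of means={statistics.mean(means):.4f}, "
          f"sd={empirical_sd:.4f} (predicted {predicted_sd:.4f}), "
          f"P(|mean-p|>=eps)~{frac_far:.4f} (Chebyshev bound {chebyshev_bound:.4f})")
```

As n grows, the printed spread should shrink roughly like 1/sqrt(n), and the observed fraction of large deviations should stay below the Chebyshev bound, which is typically far from tight.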
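
The counting question from the probability slide (how many ways to pick k unique items from n) has the standard answer n!/(k!(n-k)!), and it gives the binomial probability of exactly k successes in n Bernoulli trials. The sketch below works this through in Python, assuming Python 3.8 or later for math.comb. The helper names n_choose_k, binomial_pmf, and binomial_mean_and_variance are hypothetical names chosen for illustration.

```python
import math


def n_choose_k(n, k):
    """Number of ways to pick k unique items from n, ignoring order."""
    return math.comb(n, k)


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent Bernoulli(p) trials)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def binomial_mean_and_variance(n, p):
    """Expected value and variance computed directly from the definition."""
    mean = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))
    var = sum((k - mean) ** 2 * binomial_pmf(k, n, p) for k in range(n + 1))
    return mean, var


# Coin toss example: probability of exactly 2 heads in 4 fair tosses
# is (size of event space)/(size of sample space) = C(4, 2) / 2**4.
print(n_choose_k(4, 2), binomial_pmf(2, 4, 0.5))

# The sums should agree with the closed forms n*p and n*p*(1-p).
print(binomial_mean_and_variance(10, 0.3))
```

Computing the mean and variance by summing over the distribution, instead of using the closed forms, keeps the example tied to the definitions of expected value and variance.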
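
The covariance and correlation formulas from the basic stats slide translate directly into code. This is a small sketch under stated assumptions: it treats the data as the whole population (dividing by n, not n-1), and the function names and the sample data are made up for illustration.

```python
import math
import random


def mean(xs):
    """Arithmetic mean, used as the empirical expected value."""
    return sum(xs) / len(xs)


def covariance(xs, ys):
    """Covariance(X,Y) = E((X - mu_X)(Y - mu_Y)), population version (divide by n)."""
    mu_x, mu_y = mean(xs), mean(ys)
    return mean([(x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)])


def correlation(xs, ys):
    """Pearson correlation: Covariance(X,Y) / (sigma_X * sigma_Y)."""
    sigma_x = math.sqrt(covariance(xs, xs))  # variance is covariance with itself
    sigma_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sigma_x * sigma_y)


# Perfectly linear data has correlation close to +1 or -1.
xs = [1, 2, 3, 4]
print(correlation(xs, [2, 4, 6, 8]), correlation(xs, [8, 6, 4, 2]))

# Two independent sequences of coin tosses should have correlation near 0.
rng = random.Random(0)  # fixed seed is an illustrative assumption
a = [rng.randint(0, 1) for _ in range(10000)]
b = [rng.randint(0, 1) for _ in range(10000)]
print(correlation(a, b))
```

The correlation is undefined when either variable has zero variance; the sketch does not guard against that case.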