Review of MGT 2110




Descriptive Statistics
Probability distribution
Estimation (Confidence interval)
Inference (Hypothesis testing)
Descriptive Statistics
- Numerical measures
  o Mean, Median, Mode
  o Variance and standard deviation
  o Percentiles
  o Quartiles and Interquartile Range
  o Frequency distribution (use the Excel FREQUENCY array function)
- Graphical Presentations
  o Histogram
  o Scatter Diagram (for two columns of data)
Probability Distribution
Random Variable (RV): A numerical description of the outcome of an experiment.
Discrete RV: A random variable that can take a countable set of values. For instance, if
an experiment consists of inspecting 10 laptops produced by a manufacturer, then a
random variable X can be defined as the number of defective laptops in the lot. The
possible values for X are the whole numbers from 0 to 10.
Continuous RV: A random variable that can take an uncountable range of values. For
instance, if an experiment consists of measuring the amount of toothpaste in a 6 oz. tube,
then a random variable X can be defined as the amount of toothpaste in a tube. The
possible values for X could be any value between 5.8 oz. and 6.2 oz. The values within
the range are not countable.
Probability Distribution: A description of how the probabilities are distributed over the
values the random variable can assume. The probability distribution for a discrete RV is
called a discrete probability distribution. The probability distribution for a continuous RV
is called a continuous probability distribution.
Discrete probability distribution:
Example: The following data represents a summary of the grades received by students.
Determine the corresponding discrete probability distribution.
Grade    No. of students    Probability
A        12                 0.185
B        25                 0.385
C        20                 0.308
D         5                 0.077
F         3                 0.046
Total    65                 1
Expected Value: The expected value of a RV is the long-run average value of the RV when
the experiment is repeated many times.
Expected Value of a Discrete Random Variable: $E(x) = \mu = \sum x\,f(x)$, where f(x) is
the probability that the RV takes the value x.
Example: For the above data, let X = grade points (A = 4, B=3, C=2, D=1, F=0).
Determine the expected (average) grade points for the class.
Grade    No. of students    Probability P(X)    X    X * P(X)
A        12                 0.185               4    0.738
B        25                 0.385               3    1.154
C        20                 0.308               2    0.615
D         5                 0.077               1    0.077
F         3                 0.046               0    0.000
Total    65                 1

$E(x) = \sum x\,P(x) = 2.585$
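The grade example can be checked with a few lines of Python. This is only a sketch of the arithmetic in the table above; the dictionary and variable names are illustrative and do not come from the handout.

```python
# Grade counts from the example table.
counts = {"A": 12, "B": 25, "C": 20, "D": 5, "F": 3}
# Grade points assigned to each letter grade (A = 4, ..., F = 0).
points = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}

total = sum(counts.values())  # number of students in the class

# Discrete probability distribution: relative frequency of each grade.
prob = {grade: count / total for grade, count in counts.items()}

# Expected value: sum of x * P(x) over all grades.
expected = sum(points[grade] * prob[grade] for grade in counts)

for grade in counts:
    print(f"{grade}: P = {prob[grade]:.3f}, x*P = {points[grade] * prob[grade]:.3f}")
print(f"Total students = {total}, E(x) = {expected:.3f}")
```

Computing with unrounded probabilities avoids the small rounding drift that appears when the three-decimal table values are summed by hand.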
Continuous probability distribution:
Normal Probability Distribution: A continuous probability distribution. The normal
distribution is a symmetrical distribution with a mean, μ, and a standard deviation, σ.
Example
A department store has determined that its customers charge an average of $500 per
month, with a standard deviation of $80. Assume the amounts of charges are normally
distributed.
a. What percentage of customers charge less than $340 per month?
b. What percentage of customers charge more than $380 per month?
c. What percentage of customers charge between $644 and $700 per month?
d. What is the least dollar amount of the top 10% of customer charges?
e. What are the minimum and maximum of the middle 95% of customer charges?
Four Excel functions for answering the above questions
To find probabilities using the normal distribution:
=NORMSDIST(z)
Returns the cumulative probability for z. z = (X – μ)/σ must be calculated before using this function.
=NORMDIST(X,μ,σ,TRUE)
Returns the cumulative probability for X
To find the value of X, given a normal probability:
=NORMSINV(probability)
Returns the Normal table value of z for the given cumulative probability
Then, X may be computed using X = μ + zσ
=NORMINV(probability,μ,σ)
Returns the value of X for the given cumulative probability
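As a cross-check outside Excel, questions (a) through (e) can be sketched with Python's standard-library `statistics.NormalDist`, whose `cdf` and `inv_cdf` methods play the roles of NORMDIST and NORMINV. The variable names are illustrative assumptions, not part of the handout.

```python
from statistics import NormalDist

# Monthly charges: mean $500, standard deviation $80, assumed normal.
charges = NormalDist(mu=500, sigma=80)

p_a = charges.cdf(340)                     # (a) P(X < 340)
p_b = 1 - charges.cdf(380)                 # (b) P(X > 380)
p_c = charges.cdf(700) - charges.cdf(644)  # (c) P(644 < X < 700)
x_d = charges.inv_cdf(0.90)                # (d) cutoff for the top 10%
lo_e = charges.inv_cdf(0.025)              # (e) lower end of the middle 95%
hi_e = charges.inv_cdf(0.975)              # (e) upper end of the middle 95%

print(f"a. {p_a:.2%}  b. {p_b:.2%}  c. {p_c:.2%}")
print(f"d. ${x_d:.2f}  e. ${lo_e:.2f} to ${hi_e:.2f}")
```

Note that (b) and (d) work from the upper tail, so the cumulative probability passed to or returned by the function is the complement of the stated percentage.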
Estimation (Confidence Interval)
Confidence Interval for population mean (μ)
Assume a simple random sample of size n
Point Estimation:
                      Sample Statistic    Population Parameter
Size                  n                   N
Mean                  $\bar{x}$           μ
Standard deviation    S                   σ

Confidence Interval = $\bar{x} \pm SE$
SE = Sampling Error: $SE = t_{\alpha/2} \cdot \frac{S}{\sqrt{n}}$
(Always use t, use Z only if σ is known)
Then, Confidence interval for μ: $\mu = \bar{x} \pm t_{\alpha/2} \cdot \frac{S}{\sqrt{n}}$
Two methods for calculating confidence interval
Method A – Using Excel TINV function
Step 1: Find the t-table value using the Excel function =TINV(p,df), where
p = α (total tail area for a two-tailed test) and
df = degrees of freedom = n-1
Step 2: Determine the sampling error (SE):
$SE = t_{\alpha/2} \cdot \frac{S}{\sqrt{n}}$
Step 3: Calculate the lower and upper limits of the confidence interval:
$LL = \bar{x} - SE$
$UL = \bar{x} + SE$
Method B – Using Excel Data Analysis command
Step 1: Run the Descriptive Statistics command from the Data Analysis command with
Confidence Level for Mean checked. The output includes the sampling error as the last
item of the output table, labeled Confidence Level.
Step 2: Calculate the lower and upper limits of the confidence interval:
$LL = \bar{x} - SE$
$UL = \bar{x} + SE$
Example 1
A sample of 100 cans of coffee showed an average weight of 13 ounces with a standard
deviation of 0.8 ounces. Develop and interpret a 98% confidence interval for the mean
weight of coffee in the cans.
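The three steps of Method A can be sketched for Example 1 in Python. The standard library has no t-distribution, so this sketch stands in for TINV with two hypothetical helpers (`two_tail_area` and `t_critical`) that integrate the t density numerically and search by bisection. They are an assumption made for illustration, not the method used in the course.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_tail_area(t, df, steps=2000):
    """P(|T| > t), using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1.0 - 2.0 * (total * h / 3.0)

def t_critical(alpha, df):
    """Bisection for the t value whose two-tail area is alpha (role of TINV)."""
    lo, hi = 0.0, 50.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if two_tail_area(mid, df) > alpha:
            lo = mid   # tail area still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2

# Example 1: 100 cans, mean 13 oz, standard deviation 0.8 oz, 98% confidence.
n, x_bar, s, confidence = 100, 13.0, 0.8, 0.98
alpha = 1 - confidence
t_crit = t_critical(alpha, n - 1)   # Step 1: t-table value, df = n - 1
se = t_crit * s / math.sqrt(n)      # Step 2: sampling error
ll, ul = x_bar - se, x_bar + se     # Step 3: lower and upper limits
print(f"t = {t_crit:.4f}, SE = {se:.4f}, CI = ({ll:.3f}, {ul:.3f})")
```

Under these assumptions the interval comes out at roughly 12.81 to 13.19 ounces, which would be read as: we are 98% confident that the mean weight of coffee in all cans lies in that range.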
Example 2
For the Net Income as a % of equity, develop and interpret a 97% confidence interval for
the mean.
Confidence Interval for population proportion (p)
Assume a simple random sample of size n
Point Estimation:
              Sample Statistic    Population Parameter
Size          n                   N
Proportion    $\bar{p}$           p

Confidence Interval for p = $\bar{p} \pm SE$
Estimating Sampling Error (SE): $SE = z_{\alpha/2} \cdot \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}}$
Then, Confidence interval for p: $p = \bar{p} \pm z_{\alpha/2} \cdot \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}}$
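Step 1: Find the z-table value using the Excel function =ZINV(α/2). ZINV is not a
built-in Excel function; in standard Excel, =NORMSINV(1 – α/2) gives $z_{\alpha/2}$.
Step 2: Determine the standard error estimate:
$\sqrt{\frac{\bar{p}(1 - \bar{p})}{n}}$
Step 3: Determine the sampling error (SE):
$SE = z_{\alpha/2} \cdot \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}}$
Step 4: Calculate the lower and upper limits of the confidence interval:
$LL = \bar{p} - SE$
$UL = \bar{p} + SE$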
Example
In a poll 600 voters were asked whether they were in favor of eliminating plastic bags in
grocery stores. 390 of the voters were in favor and 210 of the voters were opposed.
Develop a 92% confidence interval estimate for the proportion of all the voters who are
opposed to the proposal.
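The poll example can be worked through in Python as a sketch of the four steps for a proportion. `NormalDist().inv_cdf` is used here in place of the Excel z lookup; the variable names are illustrative assumptions.

```python
import math
from statistics import NormalDist

n = 600            # voters polled
opposed = 210      # voters opposed to the proposal
confidence = 0.92

p_bar = opposed / n                              # sample proportion opposed
alpha = 1 - confidence
z = NormalDist().inv_cdf(1 - alpha / 2)          # Step 1: z-table value
std_error = math.sqrt(p_bar * (1 - p_bar) / n)   # Step 2: standard error estimate
se = z * std_error                               # Step 3: sampling error
ll, ul = p_bar - se, p_bar + se                  # Step 4: interval limits

print(f"p_bar = {p_bar:.3f}, z = {z:.4f}, SE = {se:.4f}")
print(f"92% CI for the proportion opposed: ({ll:.4f}, {ul:.4f})")
```

The question asks about voters who are opposed, so the sample proportion is 210/600 rather than 390/600.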
Inference (Hypothesis Testing)
Step 1: Set up the null and the alternative hypotheses.
Three types of hypotheses
Type          For population mean μ       For population proportion p
Two-tailed    H0: μ = a,  Ha: μ ≠ a       H0: p = p0,  Ha: p ≠ p0
One-tailed    H0: μ ≤ a,  Ha: μ > a       H0: p ≤ p0,  Ha: p > p0
One-tailed    H0: μ ≥ a,  Ha: μ < a       H0: p ≥ p0,  Ha: p < p0
Step 2: Decision rule for testing the hypotheses
Possible results of a Hypothesis Test
                  H0 is true          H0 is false
H0 is accepted    Correct decision    Type II error
H0 is rejected    Type I error        Correct decision

Decision rule: Reject H0 if the probability of type I error <= α, where
α = Level of significance, i.e. the maximum tolerable value for the
probability of type I error up to which H0 can be rejected
Note: Probability of type II error = β
Step 3: Compute p-value and reject H0, if p-value <= α.
Case 1: For hypotheses about μ, use t-distribution for p-value
p-value = TDIST(abs(t),df,k)
Where $t = \frac{\bar{x} - \mu_0}{S/\sqrt{n}}$, $\mu_0$ = the hypothesized mean (a in Step 1),
df = degrees of freedom = n-1, and k = number of tails, 1 or 2.
Case 2: For hypotheses about p, use z-distribution for p-value
p-value = 1 - NORMSDIST(abs(z)) for one-tailed tests
p-value = 2*(1 - NORMSDIST(abs(z))) for two-tailed tests
Where $z = \frac{\bar{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$
For one-tailed tests, these p-values apply when the sample statistic lies in the direction
stated by the alternative hypothesis.
Example 1:
A sample of 81 account balances of a credit company showed an average balance of
$1,200 with a standard deviation of $126. Determine if the mean of all account balances
is significantly different from $1,150. Use a .05 level of significance.
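Example 1 is a two-tailed test about μ (H0: μ = 1,150 against the alternative μ ≠ 1,150). The sketch below computes the t statistic from Step 3 and approximates the two-tailed p-value by numerically integrating the t density. The helper `two_tail_area` is a hypothetical stand-in for TDIST(abs(t),df,2), assumed here only because Python's standard library has no t-distribution.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_tail_area(t, df, steps=2000):
    """P(|T| > t), using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1.0 - 2.0 * (total * h / 3.0)

# Example 1: 81 balances, mean $1,200, standard deviation $126, H0: mu = 1,150.
n, x_bar, s, mu_0, alpha = 81, 1200.0, 126.0, 1150.0, 0.05

t_stat = (x_bar - mu_0) / (s / math.sqrt(n))   # test statistic
p_value = two_tail_area(abs(t_stat), n - 1)    # two-tailed p-value, df = n - 1
reject = p_value <= alpha                      # decision rule from Step 3

print(f"t = {t_stat:.4f}, p-value = {p_value:.5f}, reject H0: {reject}")
```

Here the t statistic is 50/14, and the p-value falls well below .05, so the sample indicates that the mean balance differs significantly from $1,150.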
Example 2:
It is assumed that at least half the membership of a national trade union is female. A
random sample of 400 members showed 168 women. Does the sample show that the
proportion of women among the membership is less than 50%? Use a .05 level of
significance for this hypothesis test.
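Example 2 is a one-tailed test about p (H0: p ≥ 0.5 against the alternative p < 0.5). A minimal Python sketch of Case 2 follows, with `NormalDist().cdf` taking the place of NORMSDIST; the variable names are illustrative assumptions.

```python
import math
from statistics import NormalDist

n = 400        # members sampled
women = 168    # women in the sample
p_0 = 0.5      # hypothesized proportion under H0
alpha = 0.05

p_bar = women / n                                  # sample proportion
z = (p_bar - p_0) / math.sqrt(p_0 * (1 - p_0) / n)  # test statistic
p_value = 1 - NormalDist().cdf(abs(z))             # one-tailed p-value
reject = p_value <= alpha                          # decision rule from Step 3

print(f"p_bar = {p_bar:.2f}, z = {z:.2f}, p-value = {p_value:.6f}, reject H0: {reject}")
```

The sample proportion (0.42) lies below 0.5, in the direction of the alternative, so the one-tailed formula with abs(z) applies. The standard error uses the hypothesized p0, not the sample proportion.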
Example 3:
It is normally assumed that the net income as % equity for the companies in the
population is no more than 13%. However, test whether the sample data shows that the
net income as % equity for the companies in the population is now greater than 13%.
Use a .01 level of significance.