The Math Part of the Course
Measures of Central Tendency
Mode: The number with the highest frequency in a dataset
Median: The middle number in a dataset when the values are put in order
Mean: The average of the dataset
When to use each:
Mode: Good for non-numerical data and for frequent occurrences
Median: When an outlier may significantly influence the mean, use median
Mean: When data have no likely outlier, use mean
Measures of Dispersion
Range: The difference between the largest and smallest values in a dataset (describes the extremes around the typical case)
Standard deviation: Shows how much variation there is from the mean. A low standard deviation
indicates that the data points tend to be very close to the mean, whereas a high standard deviation
indicates that the data are spread out over a large range of values.
Population Standard Deviation Formula: σ = √( Σ(x − μ)² / N )
Sample Standard Deviation Formula: s = √( Σ(x − x̄)² / (n − 1) )
Solving for population standard deviation:
Assume the dataset: 1, 8, 14, 29, 46
Step one: Solve for μ:
μ = (1 + 8 + 14 + 29 + 46) / 5 = 98 / 5 = 19.6
Step two: Solve for Σ(x − μ)²:

x      μ       x − μ     (x − μ)²
1      19.6    -18.6     345.96
8      19.6    -11.6     134.56
14     19.6    -5.6      31.36
29     19.6    9.4       88.36
46     19.6    26.4      696.96
                   Σ =   1297.20

Step three: Solve the final equation:
σ = √(1297.20 / 5) = √259.44 ≈ 16.11
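As a check on the arithmetic above, the same three steps can be sketched in a few lines of Python (the dataset and expected values come from the worked example; `statistics.pstdev` is the standard library's population standard deviation):

```python
import math
import statistics

data = [1, 8, 14, 29, 46]

# Step one: the mean
mu = sum(data) / len(data)              # 19.6

# Step two: the sum of squared deviations from the mean
ss = sum((x - mu) ** 2 for x in data)   # 1297.2

# Step three: population standard deviation
sigma = math.sqrt(ss / len(data))       # about 16.11

# The standard library computes the same quantity directly:
assert abs(sigma - statistics.pstdev(data)) < 1e-9
print(round(sigma, 2))
```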
The Normal Distribution
Say μ = 2 and σ = 1/3 in a normal distribution.
The graph of the normal distribution is as follows:
[Graph: normal curve, μ = 2, σ = 1/3]
The following graph represents the same information, but it has been standardized so that μ = 0 and
σ = 1:
[Graph: standard normal curve, μ = 0, σ = 1]
The two graphs have different μ and σ, but have the same shape (if we tweak the axes).
The new distribution of the normal random variable Z with mean 0 and variance 1 (or standard
deviation 1) is called a standard normal distribution. Standardizing the distribution like this makes it
much easier to calculate probabilities.
Considering our example above where μ = 2, σ = 1/3, then
One-half standard deviation = σ/2 = 1/6, and
Two standard deviations = 2σ = 2/3
If we have mean μ and standard deviation σ, then Z = (X − μ)/σ.
Since all the values of X falling between x1 and x2 have corresponding Z values between z1 and z2, it
means:
The area under the X curve between X = x1 and X = x2 equals:
The area under the Z curve between Z = z1 and Z = z2.
Hence, we have the following equivalent probabilities:
P(x1 < X < x2) = P(z1 < Z < z2)
So ½ s.d. to 2 s.d. to the right of μ = 2 will be represented by the area from x1 = 2 + 1/6 ≈ 2.17
to x2 = 2 + 2/3 ≈ 2.67.
This area is graphed as follows:
[Graph: shaded area under the normal curve, μ = 2, σ = 1/3]
The area above is exactly the same as the area from z1 = 0.5 to z2 = 2 in the standard normal curve:
[Graph: shaded area under the standard normal curve, μ = 0, σ = 1]
Finding the Area Under the Normal Curve
In the standard normal curve, the mean is 0 and the standard deviation is 1.
The green shaded area in the diagram represents the area that is within 1.45 standard deviations
from the mean. The area of this shaded portion is 0.4265 (or 42.65% of the total area under the
curve).
To get this area of 0.4265, we read down the left side of the table for the standard deviation's first 2
digits (the whole number and the first number after the decimal point, in this case 1.4), then we read
across the table for the "0.05" part (the top row represents the 2nd decimal place of the standard
deviation that we are interested in).
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279
We have:
(left column) 1.4 + (top row) 0.05 = 1.45 standard deviations
The area represented by 1.45 standard deviations to the right of the mean is shaded in green in the
standard normal curve above.
You can see how to find the value of 0.4265 in the full z-table below. Follow the "1.4" row across
and the "0.05" column down until they meet at 0.4265.
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2 0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
2.6 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981
2.9 0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.0 0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990
3.1 0.4990 0.4991 0.4991 0.4991 0.4992 0.4992 0.4992 0.4992 0.4993 0.4993
3.2 0.4993 0.4993 0.4994 0.4994 0.4994 0.4994 0.4994 0.4995 0.4995 0.4995
3.3 0.4995 0.4995 0.4995 0.4996 0.4996 0.4996 0.4996 0.4996 0.4996 0.4997
3.4 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4998
3.5 0.4998 0.4998 0.4998 0.4998 0.4998 0.4998 0.4998 0.4998 0.4998 0.4998
3.6 0.4998 0.4998 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999
3.7 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999
Find the area under the standard normal curve for the following, using the z-table. Sketch each one.
(a) between z = 0 and z = 0.78
(b) between z = -0.56 and z = 0
(c) between z = -0.43 and z = 0.78
(d) between z = 0.44 and z = 1.50
(e) to the right of z = -1.33.
(a) 0.2823
(b) 0.2123
(c) 0.1664 + 0.2823 = 0.4487
(d) 0.4332 - 0.1700 = 0.2632
(e) 0.4082 + 0.5 = 0.9082
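The table lookups above can be reproduced with Python's standard library: `statistics.NormalDist().cdf` gives the cumulative area up to a z value, and the area between two z values is the difference of two CDFs. A minimal sketch for exercises (a)–(e):

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def area(z1, z2):
    """Area under the standard normal curve between z1 and z2."""
    return Z.cdf(z2) - Z.cdf(z1)

print(round(area(0, 0.78), 4))       # (a) 0.2823
print(round(area(-0.56, 0), 4))      # (b) 0.2123
print(round(area(-0.43, 0.78), 4))   # (c) 0.4487
print(round(area(0.44, 1.50), 4))    # (d) 0.2632
print(round(1 - Z.cdf(-1.33), 4))    # (e) 0.9082
```

Note that `cdf` measures from the far left tail, while the printed z-table measures from the mean, which is why the table entries equal `cdf(z) − 0.5` for positive z.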
It was found that the mean length of 100 parts produced by a lathe was 20.05 mm with a standard
deviation of 0.02 mm. Find the probability that a part selected at random would have a length
(a) between 20.03 mm and 20.08 mm
(b) between 20.06 mm and 20.07 mm
(c) less than 20.01 mm
X = length of part
(a) 20.03 is 1 standard deviation below the mean;
20.08 is 1.5 standard deviations above the mean
P(20.03<X<20.08)
=P(-1<Z<1.5)
=.3413+.4332
=.7745
So the probability is 0.7745.
(b) 20.06 is 0.5 standard deviations above the mean;
20.07 is 1 standard deviation above the mean
P(20.06<X<20.07)
=P(.5<Z<1)
=.3413-.1915
=.1498
So the probability is 0.1498.
(c) 20.01 is 2 s.d. below the mean.
P(X<20.01)
=P(Z<-2)
=.5-.4772
=.0228
So the probability is 0.0228.
A company pays its employees an average wage of $3.25 an hour with a standard deviation of 60
cents. If the wages are approximately normally distributed, determine
a. the proportion of the workers getting wages between $2.75 and $3.69 an hour;
b. the minimum wage of the highest 5%.
X = wage
(a)
P(2.75<X<3.69) = P(-.833<Z<.7333)
=.298 + .268
=.566
So about 56.6% of the workers have wages between $2.75 and $3.69 an hour.
(b) Let x = the minimum wage of the highest 5%
z = 1.645 (from table)
(x − 3.25) / .60 = 1.645, so x − 3.25 = .987
x = 4.237
So the minimum wage of the top 5% of salaries is $4.24.
The average life of a certain type of motor is 10 years, with a standard deviation of 2 years. If the
manufacturer is willing to replace only 3% of the motors that fail, how long a guarantee should he
offer? Assume that the lives of the motors follow a normal distribution.
X = life of motor
x = guarantee period
Normal Curve: μ = 10, σ = 2
We need to find the value (in years) that will give us the bottom 3% of the distribution. These are
the motors that we are willing to replace under the guarantee.
P(X < x) = 0.03
The area that we can find from the z-table is
0.5 - 0.03 = 0.47
The corresponding z-score is z = -1.88.
Since Z = (X − μ)/σ, we can write:
(x − 10) / 2 = −1.88
Solving this gives x = 10 − 2(1.88) = 6.24.
So the guarantee period should be 6.24 years.
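Problems like this invert the table lookup: we know the probability and need the cutoff. `statistics.NormalDist.inv_cdf` in the Python standard library does exactly that; a sketch for the motor example:

```python
from statistics import NormalDist

# Motor lives: normal with mean 10 years, standard deviation 2 years
motor_life = NormalDist(mu=10, sigma=2)

# Guarantee period = the 3rd percentile of the life distribution,
# so that only 3% of motors fail before the guarantee expires
guarantee = motor_life.inv_cdf(0.03)
print(round(guarantee, 2))  # about 6.24 years
```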
Measures of Association

Example data: Monkey Favorability Rating by Age Group

                           Age Group
Monkey Favorability    <12    12-24    >24
  Low                   4       6       18
  Medium                8       9        9
  High                 20       8        3
Lambda:
An asymmetrical measure of association: the value varies depending on which variable is
independent.
Ranges from 0 to 1
Formula: λ = (E1 − E2) / E1
1. Calculate Row and Column Totals

                           Age Group
Monkey Favorability    <12    12-24    >24    Total
  Low                   4       6       18      28
  Medium                8       9        9      26
  High                 20       8        3      31
  Total                32      23       30      85
2. Calculate E1: Find the mode of the dependent variable (the attribute that occurs the most
often) and subtract it from N (sample size). E1=N-ƒ of the mode
E1=85-31=54
3. Calculate E2: Find the mode in each column (i.e., category of the independent variable).
Subtract each value from the column (category) total and add them together. E2=(Column
total – Column mode) + (Column total – Column mode) for all attributes of the
independent variable.
E2=(32-20)+(23-9)+(30-18)=12+14+12=38
4. Find lambda: λ = (E1 − E2) / E1 = (54 − 38) / 54 ≈ .30
We know that thirty percent of the errors in predicting monkey favorability can be reduced by
taking into account the respondent’s age.
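A short Python sketch of the lambda calculation above (the cell frequencies are from the example table; the row and column labels are just for readability):

```python
# Rows: favorability (the DV); columns: age group (the IV)
table = {
    "Low":    {"<12": 4,  "12-24": 6, ">24": 18},
    "Medium": {"<12": 8,  "12-24": 9, ">24": 9},
    "High":   {"<12": 20, "12-24": 8, ">24": 3},
}

n = sum(sum(row.values()) for row in table.values())    # 85

# E1: errors made ignoring the IV -- N minus the modal DV (row) total
row_totals = {k: sum(v.values()) for k, v in table.items()}
e1 = n - max(row_totals.values())                       # 85 - 31 = 54

# E2: errors made within each age column -- column total minus column mode
columns = {}
for row in table.values():
    for col, f in row.items():
        columns.setdefault(col, []).append(f)
e2 = sum(sum(fs) - max(fs) for fs in columns.values())  # 12 + 14 + 12 = 38

lam = (e1 - e2) / e1
print(round(lam, 3))  # 0.296
```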
Gamma:
• A measure of association using ordinal variables
• It is a symmetrical measure, therefore you don’t need to specify the IV and DV.
• Compares pairs of observations that are positive (going in the same direction) and negative
(going in the opposite direction).
• Ranges from -1 to +1
• Formula: G = (Ns − Nd) / (Ns + Nd)
• Ns = count of same-order pairs (positive); Nd = count of inverse-order pairs (negative)
                           Age Group
Monkey Favorability    <12    12-24    >24
  Low                   4       6       18
  Medium                8       9        9
  High                 20       8        3
To find Ns: Multiply each cell frequency, starting with the top left cell, by the sum of all cells
that are lower and to the right of that cell.
Ns= 4(9+8+9+3) + 8(8+3) + 6(9+3) + 9(3)
Ns= 116 + 88 + 72 + 27 = 303
To find Nd: Multiply each cell frequency, starting with the top right cell, by the sum of all cells
that are lower and to the left of that cell.
Nd= 18(9+8+8+20) + 9(8+20) + 6(8+20) + 9(20)
Nd= 810 + 252 + 168 + 180 = 1410
Interpret: G = (Ns − Nd) / (Ns + Nd) = (303 − 1410) / (303 + 1410) ≈ −.65. Using age to predict
monkey favorability results in a proportional reduction of error of 65%. There is an inverse or
negative relationship: as age increases, favorability of monkeys decreases.
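The pair counting generalizes to any ordered table: a same-order pair lies below and to the right of a cell, an inverse-order pair below and to the left. A sketch of the gamma calculation for this table:

```python
# Rows: favorability Low -> High; columns: age <12 -> >24
rows = [
    [4, 6, 18],   # Low
    [8, 9, 9],    # Medium
    [20, 8, 3],   # High
]

ns = nd = 0  # same-order and inverse-order pair counts
for i in range(len(rows)):
    for j in range(len(rows[0])):
        for i2 in range(i + 1, len(rows)):      # cells in lower rows
            for j2 in range(len(rows[0])):
                if j2 > j:                      # lower and to the right: same order
                    ns += rows[i][j] * rows[i2][j2]
                elif j2 < j:                    # lower and to the left: inverse order
                    nd += rows[i][j] * rows[i2][j2]

gamma = (ns - nd) / (ns + nd)
print(ns, nd, round(gamma, 3))  # 303 1410 -0.646
```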
Chi-Square:
Chi-square is a statistical test commonly used to compare observed data with data we would expect
to obtain according to a specific hypothesis. For example, if, according to Mendel's laws, you
expected 10 of 20 offspring from a cross to be male and the actual observed number was 8 males,
then you might want to know about the "goodness of fit" between the observed and expected. Were
the deviations (differences between observed and expected) the result of chance, or were they due to
other factors? How much deviation can occur before you, the investigator, must conclude that
something other than chance is at work, causing the observed to differ from the expected? The
chi-square test is always testing what scientists call the null hypothesis, which states that there is no
significant difference between the expected and observed result.
                           Age Group
Monkey Favorability    <12    12-24    >24    Total
  Low                   4       6       18      28
  Medium                8       9        9      26
  High                 20       8        3      31
  Total                32      23       30      85
Hypotheses: H0: Age and favorability are independent; H1: Age and favorability are related
First step: Calculate the expected values of each cell.
Our null hypothesis would be that age has no bearing on favorability of monkeys. As a result, the
null hypothesis would expect that favorability within each age group would be equal. To calculate
the expected value of a cell:
E = (row total × column total) / N
Observed (expected) frequencies:

                           Age Group
Monkey Favorability    <12           12-24        >24          Total
  Low                   4 (10.54)     6 (7.58)    18 (9.88)      28
  Medium                8 (9.79)      9 (7.04)     9 (9.18)      26
  High                 20 (11.67)     8 (8.39)     3 (10.94)     31
  Total                32            23           30             85
Second step: Calculate the chi-square calculated value.
Formula: χ² = Σ (O − E)² / E
χ² = (4−10.54)²/10.54 + (8−9.79)²/9.79 + (20−11.67)²/11.67
   + (6−7.58)²/7.58 + (9−7.04)²/7.04 + (8−8.39)²/8.39
   + (18−9.88)²/9.88 + (9−9.18)²/9.18 + (3−10.94)²/10.94
   = 23.66
Third step: Determine the critical value

        Significance Level
df       .10       .05       .025      .01       .005
1       2.7055    3.8415    5.0239    6.6349    7.8794
2       4.6052    5.9915    7.3778    9.2104   10.5965
3       6.2514    7.8147    9.3484   11.3449   12.8381
4       7.7794    9.4877   11.1433   13.2767   14.8602
5       9.2363   11.0705   12.8325   15.0863   16.7496
6      10.6446   12.5916   14.4494   16.8119   18.5475
To use this table, we need to first determine our level of significance. For the purposes of this class,
let’s always work on the assumption that we want 95% confidence (α = .05). Next, we need to
figure out our degrees of freedom (df): df = (rows − 1)(columns − 1) = (3 − 1)(3 − 1) = 4.
As a result, our critical value for .05 at df = 4 is 9.4877.
Fourth step: Compare the calculated chi-square value with the critical value.
Chi-square calculated: 23.66; chi-square critical: 9.49
As a result, we REJECT the null. We can conclude that monkey favorability and age are related in
some way.
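The whole chi-square computation can be sketched in a few lines of Python (observed values from the table above; each expected value is row total × column total / N):

```python
# Observed frequencies: rows = favorability (Low, Medium, High),
# columns = age group (<12, 12-24, >24)
observed = [
    [4, 6, 18],
    [8, 9, 9],
    [20, 8, 3],
]

n = sum(map(sum, observed))                       # 85
row_totals = [sum(r) for r in observed]           # 28, 26, 31
col_totals = [sum(c) for c in zip(*observed)]     # 32, 23, 30

chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n     # expected under independence
        chi2 += (o - e) ** 2 / e

print(round(chi2, 2))  # about 23.66
```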
Two Sample T-Test
Purpose: To compare responses from two groups. These two groups can come from different
experimental treatments, or different natural "populations".
Assumptions:
each group is considered to be a sample from a distinct population
the responses in each group are independent of those in the other group
the distributions of the variable of interest are normal
In a test of the hypothesis that females smile at others more than males, females and males were
videotaped while interacting and the number of smiles emitted was recorded. Using the following
number of smiles in the 5-minute interaction, test the null hypothesis that there are no gender
differences between the number of smiles.
Males:   8, 11, 13, 4, 2
Females: 15, 19, 13, 11, 18
Step One: Calculate the Means of Each Group
Males: x̄1 = (8 + 11 + 13 + 4 + 2) / 5 = 7.6
Females: x̄2 = (15 + 19 + 13 + 11 + 18) / 5 = 15.2
Step Two: Solve for the Variances of the Two Samples

Males:
x      x̄      x − x̄    (x − x̄)²
8      7.6     .4       .16
11     7.6     3.4      11.56
13     7.6     5.4      29.16
4      7.6     -3.6     12.96
2      7.6     -5.6     31.36
                  Σ =   85.2
s1² = 85.2 / (5 − 1) = 21.3

Females:
x      x̄      x − x̄    (x − x̄)²
15     15.2    -.2      .04
19     15.2    3.8      14.44
13     15.2    -2.2     4.84
11     15.2    -4.2     17.64
18     15.2    2.8      7.84
                  Σ =   44.8
s2² = 44.8 / (5 − 1) = 11.2

Step Three: Solve for t
t = (x̄1 − x̄2) / √(s1²/n1 + s2²/n2) = (7.6 − 15.2) / √(21.3/5 + 11.2/5) = −7.6 / √6.5 ≈ −2.98
(|t| = 2.98)
Step Four: Compare Calculated t-value with Critical t-value
To determine the critical t-value, we first need to determine the degrees of freedom (df). With
t-tests, df = n1 + n2 − 2.
df = 5 + 5 − 2 = 8
At 95% confidence (α = .05, two-tailed), the critical t-value is consequently 2.306.

df     50%     60%     70%     80%     90%     95%     98%     99%     99.5%   99.8%   99.9%
1      1.000   1.376   1.963   3.078   6.314   12.71   31.82   63.66   127.3   318.3   636.6
2      0.816   1.061   1.386   1.886   2.920   4.303   6.965   9.925   14.09   22.33   31.60
3      0.765   0.978   1.250   1.638   2.353   3.182   4.541   5.841   7.453   10.21   12.92
4      0.741   0.941   1.190   1.533   2.132   2.776   3.747   4.604   5.598   7.173   8.610
5      0.727   0.920   1.156   1.476   2.015   2.571   3.365   4.032   4.773   5.893   6.869
6      0.718   0.906   1.134   1.440   1.943   2.447   3.143   3.707   4.317   5.208   5.959
7      0.711   0.896   1.119   1.415   1.895   2.365   2.998   3.499   4.029   4.785   5.408
8      0.706   0.889   1.108   1.397   1.860   2.306   2.896   3.355   3.833   4.501   5.041
9      0.703   0.883   1.100   1.383   1.833   2.262   2.821   3.250   3.690   4.297   4.781
10     0.700   0.879   1.093   1.372   1.812   2.228   2.764   3.169   3.581   4.144   4.587
11     0.697   0.876   1.088   1.363   1.796   2.201   2.718   3.106   3.497   4.025   4.437
12     0.695   0.873   1.083   1.356   1.782   2.179   2.681   3.055   3.428   3.930   4.318
13     0.694   0.870   1.079   1.350   1.771   2.160   2.650   3.012   3.372   3.852   4.221
14     0.692   0.868   1.076   1.345   1.761   2.145   2.624   2.977   3.326   3.787   4.140
15     0.691   0.866   1.074   1.341   1.753   2.131   2.602   2.947   3.286   3.733   4.073
16     0.690   0.865   1.071   1.337   1.746   2.120   2.583   2.921   3.252   3.686   4.015

(Columns give the confidence level for a two-tailed test.)
t-score calculated: 2.98; t-score critical: 2.306
As a result, we REJECT the null. We can conclude that gender and smiling are related in some way.
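A minimal Python sketch of the calculation (data from the smiling example; the t formula below uses the two sample variances separately, which here gives the same value as the pooled form because the group sizes are equal):

```python
import math

males   = [8, 11, 13, 4, 2]
females = [15, 19, 13, 11, 18]

def mean(xs):
    return sum(xs) / len(xs)

def sample_var(xs):
    """Sample variance: sum of squared deviations over n - 1."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# t = (mean1 - mean2) / sqrt(s1^2/n1 + s2^2/n2)
t = (mean(males) - mean(females)) / math.sqrt(
    sample_var(males) / len(males) + sample_var(females) / len(females)
)
df = len(males) + len(females) - 2   # 8
print(round(abs(t), 2), df)          # 2.98 8
```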
Regression
Regression is a tool for describing how, how strongly, and under what conditions an independent
and dependent variable are associated.
It can be used to make causal inferences.
The ordinary least squares regression formula is Y = a + bX and describes the slope of a line:
– Y = dependent variable
– a = y-intercept (or constant)
– b = slope or coefficient
– X = independent variable
If b is positive, the relationship is positive; if b is negative, the relationship is negative.
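The slope and intercept come from the usual least-squares formulas, b = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and a = ȳ − b·x̄. A minimal sketch with a made-up illustrative dataset (the numbers are not from any example in these notes):

```python
def ols(xs, ys):
    """Least-squares estimates of a and b in Y = a + bX."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

# Made-up illustrative data: Y falls by 2 for each unit of X
a, b = ols([1, 2, 3, 4], [10, 8, 6, 4])
print(a, b)  # 12.0 -2.0
```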
Interpreting Regression
Data are gathered on 40 countries to study variations in birth rate. Consider this equation:
Y = 32-.0018X
r = - .78
Seb = .00024
Where: Y = birth rate per 1000 population and X = per capita income
Identify the following: independent and dependent variables; regression coefficient; the constant; the
correlation coefficient; the coefficient of determination; the standard error of the slope.
IV: Per capita income
DV: Birth rate per 1000 population
Regression coefficient: -.0018 (for every increase of 1 in per capita income, we see a decrease
of .0018 in birth rate per 1000 population)
Constant: 32 (the predicted value of Y would be 32 if X=0)
Correlation coefficient: -.78 (there is a strong, negative relationship)
Coefficient of determination: .6084 (-.78*-.78)
Standard error of the slope: .00024
What percent variation in birth rate is associated with per capita income?
60.84% (r² = (-.78)(-.78) = .6084)
What is the direction of the relationship?
Negative
Calculate the t-ratio. What does this tell you?
t = b / Seb = -.0018 / .00024 = -7.5. It allows us to test the hypothesis that b = 0. df = 38 (n-2).
The critical t-value at 95% confidence and df = 38 is 2.024.
Since |-7.5| > 2.024, we REJECT the null. We can conclude that per capita income and birth rate
are related in some way.
A country has a per capita income of $2000. Estimate its birth rate.
Y = 32-.0018X
Y= 32-.0018(2000)
Y= 32-3.6
Y= 28.4
28.4 births per 1000 population
Interpreting Multiple Regression
Regression
Model Summary

Model    R       R Square    Adjusted R Square    Std. Error of the Estimate
1        .638a     .407            .403                   19.469

a. Predictors: (Constant), ZZ11. PRE IWR OBS: R gender, Y6. Employment status, J1.
Party ID: Does R think of self as Dem, Rep, Ind or what, Y1x. Age of Respondent, Y3.
Highest grade of school or year of college R completed, C5ax. SUMMARY: R better/worse
off than 1 year ago, F1ax. SUMMARY: economy better worse in last year, Y21a. Household
income
R-Square is the proportion of variance in the dependent variable which can be predicted from the
independent variables. This value indicates that 41% of the variance in the dependent variable can
be predicted from the independent variables. Note that this is an overall measure of the strength of
association, and does not reflect the extent to which any particular independent variable is
associated with the dependent variable.
ANOVA(b)

Model            Sum of Squares     df     Mean Square       F       Sig.
1  Regression      352041.587        8      44005.198     116.098    .000a
   Residual        513212.737     1354        379.035
   Total           865254.324     1362

a. Predictors: (Constant), ZZ11. PRE IWR OBS: R gender, Y6. Employment status, J1. Party ID: Does R think of self as Dem,
Rep, Ind or what, Y1x. Age of Respondent, Y3. Highest grade of school or year of college R completed, C5ax. SUMMARY: R
better/worse off than 1 year ago, F1ax. SUMMARY: economy better worse in last year, Y21a. Household income
b. Dependent Variable: B1j. Feeling Thermometer: Republican Party
The F value is the Mean Square Regression divided by the Mean Square Residual. The p
value associated with this F value is very small (0.0000). These values are used to answer the
question "Do the independent variables reliably predict the dependent variable?". The p value is
compared to your alpha level (typically 0.05) and, if smaller, you can conclude "Yes, the independent
variables reliably predict the dependent variable". If the p value were greater than 0.05, you would
say that the group of independent variables does not show a significant relationship with the
dependent variable, or does not reliably predict it. Note that this is an overall significance test
assessing whether the group of independent variables used together reliably predicts the dependent
variable, and does not address the ability of any particular independent variable to predict the
dependent variable. The ability of each individual independent variable to predict the dependent
variable is addressed in the table below, where each of the individual variables is listed.
Coefficients(a)

                                                 Unstandardized     Standardized
                                                  Coefficients      Coefficients
Model                                             B      Std. Error    Beta        t        Sig.
1  (Constant)                                   62.215     3.569                  17.430    .000
   C5ax. SUMMARY: R better/worse off than
   1 year ago                                     .418      .432        .021        .966    .334
   F1ax. SUMMARY: economy better worse in
   last year                                     3.763      .743        .113       5.062    .000
   J1. Party ID: Does R think of self as Dem,
   Rep, Ind or what                              7.393      .271        .601      27.269    .000
   Y1x. Age of Respondent                         .087      .034        .054       2.546    .011
   Y3. Highest grade of school or year of
   college R completed                           -.632      .243       -.062      -2.601    .009
   Y6. Employment status                        -1.772     2.398       -.016       -.739    .460
   Y21a. Household income                         .018      .106        .004        .169    .865
   ZZ11. PRE IWR OBS: R gender                  -2.877     1.072       -.057      -2.684    .007

a. Dependent Variable: B1j. Feeling Thermometer: Republican Party
Feeling thermometer Republican Party = 62.215 + .418Better/Worse Off + 3.763 Economy +
7.393 PartyID + .087 Age - .632 Education – 1.772 Unemployed + .018 Income – 2.877 Gender
B: These estimates tell you about the relationship between the independent variables and the
dependent variable. Each estimate gives the amount of increase in Feeling Thermometer Republican
that would be predicted by a 1-unit increase in that predictor.
Beta: These are the values for a regression equation if all of the variables are standardized to have a
mean of zero and a standard deviation of one. Because the standardized variables are all expressed
in the same units, the magnitudes of the standardized coefficients indicate which variables have the
greatest effects on the predicted value. This is not necessarily true of the unstandardized
coefficients. Because the magnitudes of the unstandardized coefficients can largely depend on the
units of the variables, the effects of the variable on the prediction can be difficult to gauge. While
the standardized coefficients may vary significantly from the unstandardized coefficients in
magnitude, the sign (positive or negative) of the coefficients is unchanged.
These columns provide the t value and 2-tailed p value used in testing the null hypothesis that the
coefficient is 0. Coefficients having p values less than alpha are significant. For example, if you
chose alpha to be 0.05, coefficients having a p value of 0.05 or less would be statistically significant
(i.e., you can reject the null hypothesis and say that the coefficient is significantly different from 0).