MAE244
ANALYSIS
c.1
STATISTICAL ANALYSIS
Mean and Standard Deviations
Statistical analysis is often used to explain variations in experimental data. It is the basis on
which predictions can be made from measurements (as in extrapolation). The statistical measures
most familiar to students are probably the mean (or average), which describes the center or
location of a sample, and the standard deviation, which measures the spread of the sample.
The mean is defined as
$$\mu = \frac{\sum_i y_i}{n}$$
where y is the variable of interest for each member of the sample and n is the number of
observations in the sample. The standard deviation is the square root of the variance
$$\text{Standard Deviation} = s = \sqrt{s^2}$$
where
$$s^2 = \frac{\sum_i y_i^2 - \left(\sum_i y_i\right)^2 / n}{n}$$
Example: A total of 58 AISI 1018 cold-drawn steel bars were tested to determine the 0.2 percent
offset yield strength Sy in kpsi. The results were:
Sy (kpsi)    64   68   72   76   80   84   88   92
m             2    6    6    9   19   10    4    2

m is the number of measurements at each value of Sy.
Applying the previous equations:
$$\mu_y = \frac{\sum S_y m}{\sum m} = 78.41$$
$$s_y^2 = \frac{\sum S_y^2 m - \left(\sum S_y m\right)^2 / \sum m}{\sum m} = 42.45$$
$$s_y = \sqrt{s_y^2} = 6.52$$
[Figure: histogram of frequency versus yield strength Sy, kpsi]
Therefore, the yield strength of the steel is 78.41 ± 6.52 kpsi (mean ± one standard deviation).
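The yield strength example can be checked with a short script. This is a minimal sketch in Python, not part of the original handout; the function name grouped_mean_std is an illustrative choice, and the variance is divided by n to match the equation given in the text.

```python
import math

# Frequency table from the example: yield strength (kpsi) and number of bars at each value.
sy_values = [64, 68, 72, 76, 80, 84, 88, 92]
counts = [2, 6, 6, 9, 19, 10, 4, 2]


def grouped_mean_std(values, counts):
    """Return (mean, variance, standard deviation) of grouped data.

    The variance is divided by n, following the formula in the text.
    """
    n = sum(counts)
    total = sum(v * m for v, m in zip(values, counts))
    total_sq = sum(v * v * m for v, m in zip(values, counts))
    mean = total / n
    variance = (total_sq - total**2 / n) / n
    return mean, variance, math.sqrt(variance)


mean, variance, std = grouped_mean_std(sy_values, counts)
print(f"n = {sum(counts)}, mean = {mean:.2f}, variance = {variance:.2f}, std = {std:.2f}")
# prints: n = 58, mean = 78.41, variance = 42.45, std = 6.52
```

The counts sum to 58, which agrees with the number of bars stated in the example.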
Often in laboratory experiments, students collect data (e.g. strain) in response to some known
stimulus (e.g. load) and are asked to determine the relationship between x (strain) and y
(load). As an example, Young's modulus (or the elastic modulus) is the slope of the linear
relationship between stress (y) and strain (x). To find Young's modulus, a student would plot
stress against strain and then draw a line that best fits the data. The slope could then be
determined by finding the change in y over the change in x (y = mx + b from algebra). The problem
with this method is that everyone would probably draw the line differently, so the experimental
data would yield only estimates of E rather than a unique value. To overcome this shortcoming,
the method of least squares is used.
Linear Regression (Least Squares Fit)
The least squares method can be used to fit a polynomial of nth degree. Because our
experiments will be conducted in the linear range of linear elastic materials, only the fit of a
straight line needs to be considered. Thus, the least squares fit determines the slope, m, and
the intercept, b, of the straight line
y = mx + b
that best represents the experimental data (x1, y1), (x2, y2), ..., (xn, yn). The least squares
fit tells how changes in x affect changes in y, where x is the independent variable and y is the
dependent variable.
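[Figure: data points scattered about the fitted line y = mx + b, with the y variable (dependent) on the vertical axis and the x variable (independent) on the horizontal axis; each actual point lies at y = mx + b + ε]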
The term ε is added to define the actual location of each point (i.e. ε is an error term). For n
pairs of x and y data points, the slope, m, and the intercept, b, are calculated using the
following equations:
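$$\bar{x} = \frac{\sum x}{n}, \quad \bar{y} = \frac{\sum y}{n}$$
$$S_{xx} = \sum x^2 - \frac{\left(\sum x\right)^2}{n}, \quad S_{xy} = \sum xy - \frac{\left(\sum x\right)\left(\sum y\right)}{n}$$
$$m = \frac{S_{xy}}{S_{xx}}, \quad b = \bar{y} - m\bar{x}$$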
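The slope and intercept equations translate directly into a few lines of code. The following is a minimal sketch in Python; the function name fit_line and the sample points are hypothetical and chosen so the answer is easy to check by hand.

```python
def fit_line(x, y):
    """Least squares slope m and intercept b for y = m*x + b."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    s_xx = sum(xi * xi for xi in x) - sum_x**2 / n
    s_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
    m = s_xy / s_xx
    b = sum_y / n - m * (sum_x / n)  # b = y_bar - m * x_bar
    return m, b


# Hypothetical points that lie exactly on y = 2x + 1.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
m, b = fit_line(x, y)
print(m, b)  # prints: 2.0 1.0
```

For these points S_xx = 5 and S_xy = 10, so m = 2 and b = 4 - 2(1.5) = 1, as expected for data with no scatter.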
Correlation
The main use of regression is prediction. The sample correlation coefficient, r, is the
statistic used to determine the strength of the correlation (or prediction). It is found using
$$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$$
where S_yy is defined in the same way as S_xx with y in place of x, r = 1 is a perfect positive
fit, and r = -1 is a perfect negative fit. r^2, the coefficient of determination, is often used
to indicate the proportion of the variability in y explained by the linear bivariate association
with x.
Example: r = 0.89, therefore r^2 = 0.79. Then 79% of the variability in y is explained by the
linear relationship with x.
Regression is for prediction!
Correlation is the strength of the prediction!
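The correlation coefficient can be sketched in code as follows. This is an illustrative Python example with hypothetical data; it assumes S_yy is computed like S_xx with y in place of x, which the text does not spell out.

```python
import math


def correlation(x, y):
    """Sample correlation coefficient r = S_xy / sqrt(S_xx * S_yy)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    s_xx = sum(xi * xi for xi in x) - sum_x**2 / n
    s_yy = sum(yi * yi for yi in y) - sum_y**2 / n  # assumed by analogy with S_xx
    s_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical data sets for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_up = [2.0, 4.0, 6.0, 8.0, 10.0]    # exactly linear, rising
y_down = [10.0, 8.0, 6.0, 4.0, 2.0]  # exactly linear, falling
y_noisy = [2.0, 3.0, 7.0, 8.0, 9.0]  # rising with scatter

print(correlation(x, y_up))    # prints: 1.0
print(correlation(x, y_down))  # prints: -1.0
r = correlation(x, y_noisy)
print(f"r = {r:.2f}, r^2 = {r * r:.2f}")
```

The two exactly linear sets give r = 1 and r = -1, while the scattered set gives a value between them, with r^2 reporting the share of the variability in y explained by the line.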
Statistical analysis can be performed using Excel or Lotus 1-2-3, so it is not necessary to perform
hand calculations using the above equations.
The regression analysis tool performs linear regression by using the "least squares" method to
fit a line through a set of observations. Students can analyze how a single dependent variable is
affected by the values of one or more independent variables, for example, how an athlete's
performance is affected by such factors as age, height, and weight. Students can apportion shares
in the performance measure to each of these three factors, based on a set of performance data, and
then use the results to predict the performance of a new, untested athlete.
For the following set of experimental data, regression analysis was performed using Excel.
Linear Regression Example

Strain (µ mm/mm)    Stress (MPa)
0                   0
180                 5
570                 10
700                 15
1075                20
1300                25
1600                30
1690                35

Fitted line: y = 0.0193x + 0.3153, R^2 = 0.9889

[Figure: stress (MPa) versus strain (µ mm/mm), showing the experimental data and the linear regression of the data]
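The Excel result for this data set can be reproduced with the least squares equations from the earlier sections. The following is a minimal sketch in Python; the function name regress is hypothetical, and strain is taken as x and stress as y, which is the assignment that matches the reported fit.

```python
# Experimental data from the example: strain in µ mm/mm (x), stress in MPa (y).
strain = [0, 180, 570, 700, 1075, 1300, 1600, 1690]
stress = [0, 5, 10, 15, 20, 25, 30, 35]


def regress(x, y):
    """Return slope m, intercept b, and r^2 for the least squares line y = m*x + b."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    s_xx = sum(xi * xi for xi in x) - sum_x**2 / n
    s_yy = sum(yi * yi for yi in y) - sum_y**2 / n
    s_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
    m = s_xy / s_xx
    b = sum_y / n - m * (sum_x / n)
    r2 = s_xy**2 / (s_xx * s_yy)
    return m, b, r2


m, b, r2 = regress(strain, stress)
print(f"y = {m:.4f}x + {b:.4f}, R^2 = {r2:.4f}")
# prints: y = 0.0193x + 0.3153, R^2 = 0.9889
```

Because strain is expressed in units of 10^-6, a slope of 0.0193 MPa per µ mm/mm corresponds to roughly 19.3 GPa if the fitted slope is read as Young's modulus.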