Mean and Weighted Mean Centers
The mean center is a measure of the geographic
center of a set of observations.
It is analogous to the statistical mean.
We will use the x,y coordinates to determine the
mean center.
• Best to use projected coordinates whose units
are meters, feet, etc…
• Geographic coordinates (lon/lat) can be used
but are harder to deal with, since a degree does not
cover a constant ground distance.
The mean center is calculated using the equations:

$$\bar{X}_{Coord} = \frac{\sum_{i=1}^{n} X_i}{n} \qquad \bar{Y}_{Coord} = \frac{\sum_{i=1}^{n} Y_i}{n}$$

Example with n = 13:

$$\bar{X}_{Coord} = \frac{165754 + 159382 + \dots + 173851}{13} = 170924$$

$$\bar{Y}_{Coord} = \frac{21808553 + 2176152 + \dots + 2179391}{13} = 2173138$$

Mean Center = (170924, 2173138)
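The mean center calculation can be sketched in a few lines of Python. The function name `mean_center` and the four sample points are hypothetical, chosen only for illustration; they are not the 13 observations used in the example.

```python
def mean_center(points):
    """Return the (x, y) mean center of a list of (x, y) coordinate pairs."""
    if not points:
        raise ValueError("at least one point is required")
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    return mean_x, mean_y


# Hypothetical projected coordinates (e.g. meters), not data from the slides.
sample_points = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
print(mean_center(sample_points))
```

Because the X and Y means are computed independently, the same function works for any projected coordinate system as long as both axes share the same units.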
The weighted mean center is calculated by multiplying each
coordinate by its weighting factor, summing the products, and
dividing by the sum of the weights:

$$\bar{X}_W = \frac{\sum_{i=1}^{n} w_i X_i}{\sum_{i=1}^{n} w_i} \qquad \bar{Y}_W = \frac{\sum_{i=1}^{n} w_i Y_i}{\sum_{i=1}^{n} w_i}$$

$$\bar{X}_W = \frac{(2275)165754 + (3522)159382 + \dots + (1613)173851}{2275 + 3522 + \dots + 1613} = \frac{8859431281}{51147} = 173215$$

$$\bar{Y}_W = \frac{(2275)21808553 + (3522)2176152 + \dots + (1613)2179391}{2275 + 3522 + \dots + 1613} = \frac{111105588481}{51147} = 2172280$$

Weighted Mean Center = (173215, 2172280)
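A minimal sketch of the weighted version follows. The function name `weighted_mean_center`, the two points, and their weights are hypothetical; the weights stand in for whatever attribute (population, counts, etc.) is attached to each observation.

```python
def weighted_mean_center(points, weights):
    """Return the weighted (x, y) mean center.

    points  -- list of (x, y) coordinate pairs
    weights -- list of non-negative weights, one per point
    """
    if len(points) != len(weights):
        raise ValueError("points and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    x_w = sum(w * x for (x, _), w in zip(points, weights)) / total_weight
    y_w = sum(w * y for (_, y), w in zip(points, weights)) / total_weight
    return x_w, y_w


# Hypothetical data: the second point carries three times the weight,
# so the center is pulled three quarters of the way toward it.
print(weighted_mean_center([(0.0, 0.0), (10.0, 0.0)], [1, 3]))
```

With equal weights the result reduces to the ordinary mean center, which is a quick sanity check on any implementation.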
Standard Distance
The standard distance is analogous to the standard deviation.
We first determine the mean X and Y coordinates and then
calculate the squared deviation of each coordinate from its mean.
More dispersed point patterns will have larger standard distances;
clustered patterns will have smaller ones.
One can also substitute the weighted mean center and weighted
observations to calculate a weighted standard distance.
This method is sensitive to extreme observations (e.g. a point lying
far from the rest). The SD is a radius in map units around the mean
center.
First calculate the mean center (n = 18):

$$\bar{X} = \frac{3996648}{18} = 222036 \qquad \bar{Y} = \frac{38960860}{18} = 2164492$$

Mean Center = (222036, 2164492)
      Xi          Yi     Xdeviate²    Ydeviate²
  215058     2168211      48692484     13829308
  215586     2166276      41602500      3181863
  217766     2164588      18232900         9173
  217872     2162759      17338896      3004059
  217344     2161738      22014864      7585740
  214741     2159839      53217025     21652477
  219666     2165081       5616900       346659
  220400     2162020       2676496      6111883
  222304     2161352         71824      9860996
  224309     2162618       5166529      3512709
  226736     2159875      22090000     21318741
  232399     2161563     107391769      8580343
  229023     2163357      48818169      1288729
  225997     2165608      15689521      1244960
  230183     2172362      66373609     61933402
  224626     2167508       6708100      9094916
  222515     2166452        229441      3840729
  220123     2169653       3659569     26633627
∑ 3996648   38960860     485590596    203030315
$$SD = \sqrt{\frac{485590596}{18} + \frac{203030315}{18}} = 6185.2 \text{ meters}$$
Note that the SD is a radius around the mean center in coordinate
units (e.g. meters).
This is analogous to 1 standard deviation. Concentric rings could
be mapped to display several standard distances.
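The standard distance calculation can be reproduced with a short Python sketch using the 18 coordinate pairs from the worked example. The function name `standard_distance` is an assumption made for illustration; the computation follows the steps described in this section (mean center, squared deviations, division by n, square root).

```python
import math


def standard_distance(points):
    """Return the standard distance of (x, y) points around their mean center."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sum of squared deviations of each coordinate from its mean.
    sum_sq_x = sum((x - mean_x) ** 2 for x, _ in points)
    sum_sq_y = sum((y - mean_y) ** 2 for _, y in points)
    return math.sqrt(sum_sq_x / n + sum_sq_y / n)


# The 18 (Xi, Yi) observations from the worked example.
points = [
    (215058, 2168211), (215586, 2166276), (217766, 2164588),
    (217872, 2162759), (217344, 2161738), (214741, 2159839),
    (219666, 2165081), (220400, 2162020), (222304, 2161352),
    (224309, 2162618), (226736, 2159875), (232399, 2161563),
    (229023, 2163357), (225997, 2165608), (230183, 2172362),
    (224626, 2167508), (222515, 2166452), (220123, 2169653),
]

sd = standard_distance(points)
print(round(sd, 1))
```

The result should agree with the 6185.2 meters reported in the example, since the Y deviations there also appear to be computed from the unrounded mean.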
Runs Test for Sequential Nominal Data
We are interested in buildings along several streets relative to the
downtown area.
Buildings coded as black squares have had the same business
operating for over 10 years (successful).
Those buildings coded as white squares have had more than two
businesses operating there in the last 10 years (unsuccessful).
With Run A we will perform a two-tailed test.
Run A
Unlike other tests, the runs test requires no equation unless the
sample size of either group is greater than 30.
One only needs to count the number of runs (u), a run being an
unbroken series of the same nominal value when counting from one
end of the sequence to the other. The count is then compared with
tabulated critical values.
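Counting runs is mechanical, as the following Python sketch suggests. The function name `count_runs` and the example sequence are hypothetical ("B" for a black square, "W" for a white square); the sequence is not the street data from the slides.

```python
from itertools import groupby


def count_runs(sequence):
    """Count the runs (u) in a sequence of nominal values.

    A run is a maximal block of consecutive identical values.
    """
    # groupby yields one group per block of consecutive equal items.
    return sum(1 for _ in groupby(sequence))


# Hypothetical sequence: BB | WWW | B | WW  -> 4 runs
print(count_runs("BBWWWBWW"))
```

Any iterable of hashable labels works, so the same function could take a list of "successful"/"unsuccessful" strings instead of single characters.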
Run A
Test 1 (Sample A): Two-tailed Test
H0 : The distribution of shops along Elm St. is not different from random.
Ha : The distribution of shops along Elm St. is different from random.
α = 0.05 (two-tailed)
n1 = 20 (white, open businesses)
n2 = 10 (black, closed businesses)
u = 13
uCritical = 9, 20
Since 9 < 13 < 20, accept H0.
The distribution of successful and unsuccessful businesses along
Elm St. is random (u = 13, p > 0.05).
As with all tests of randomness:
2-tailed tests allow us only to state whether the samples are
random or not random.
1-tailed tests allow us to state whether non-random samples are
clustered or uniform.
• Clustered: calculated value < the lower critical value.
• Random: calculated value falls between the critical values.
• Uniform: calculated value > the upper critical value.
Run B
Test 2 (Sample B): One-tailed Test
H0 : The distribution of shops along Walnut St. is random.
Ha : The distribution of shops along Walnut St. is not random.
α = 0.05
n1 = 12 (white, open businesses)
n2 = 14 (black, closed businesses)
u = 7
uCritical = 9, 19
Since 7 < 9, reject H0.
The distribution of successful and unsuccessful businesses along
Walnut St. is clustered (u = 7, p < 0.05).
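The decision rule used in both tests can be written as a small Python sketch. The function name `classify_runs` is hypothetical, and the critical values must still be looked up in a runs-test table for the given n1, n2, and α; the values passed in below are the ones quoted in the two examples.

```python
def classify_runs(u, lower_critical, upper_critical):
    """Classify a run count against tabulated critical values.

    Fewer runs than the lower critical value suggests clustering,
    more runs than the upper critical value suggests a uniform
    (alternating) pattern, and anything in between is treated as random.
    """
    if u < lower_critical:
        return "clustered"
    if u > upper_critical:
        return "uniform"
    return "random"


# Values taken from the two worked examples.
print(classify_runs(13, 9, 20))  # Elm St.
print(classify_runs(7, 9, 19))   # Walnut St.
```

The strict inequalities mirror the wording of the decision rules listed earlier; how a run count exactly equal to a critical value is treated depends on the table being used.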
χ² Contingency Analysis
This technique uses contingency tables to calculate the expected
frequencies of an event based on the observed frequency
distribution.
Expected frequencies are determined from the row and column
totals from the contingency table.
H0 : Abandonment level is independent of road type.
Ha : Abandonment level is contingent upon road type.

$$\chi^2 = \sum \frac{(f_{ij} - \hat{f}_{ij})^2}{\hat{f}_{ij}}$$

where

$$\hat{f}_{ij} = \frac{(Row_i)(Column_j)}{n}$$

and

$$df = (Rows - 1)(Columns - 1)$$

where $\hat{f}_{ij}$ are the expected frequencies
and $f_{ij}$ are the observed frequencies.
Contingency Table

                        Graded Road    Dirt Track    TOTAL
High Abandonment              3            14          17
Moderate Abandonment          7             9          16
Low Abandonment              10             9          19
TOTAL                        20            32          52
Observed frequencies, with expected frequencies in parentheses:

                Graded Road      Dirt Track      TOTAL
High             3 (6.54)        14 (10.46)        17
Moderate         7 (6.15)         9 (9.85)         16
Low             10 (7.31)         9 (11.69)        19
TOTAL           20               32                52
$$\hat{f}_{1,1} = \frac{(20)(17)}{52} = 6.54 \qquad \hat{f}_{2,1} = \frac{(32)(17)}{52} = 10.46$$

$$\hat{f}_{1,2} = \frac{(20)(16)}{52} = 6.15 \qquad \hat{f}_{2,2} = \frac{(32)(16)}{52} = 9.85$$

$$\hat{f}_{1,3} = \frac{(20)(19)}{52} = 7.31 \qquad \hat{f}_{2,3} = \frac{(32)(19)}{52} = 11.69$$
$$\chi^2 = \frac{(3 - 6.54)^2}{6.54} + \frac{(7 - 6.15)^2}{6.15} + \frac{(10 - 7.31)^2}{7.31} + \frac{(14 - 10.46)^2}{10.46} + \frac{(9 - 9.85)^2}{9.85} + \frac{(9 - 11.69)^2}{11.69}$$

$$\chi^2 = 1.92 + 0.12 + 0.99 + 1.20 + 0.07 + 0.62$$

$$\chi^2 = 4.92$$

$$df = (2 - 1)(3 - 1) = 2$$

$$\chi^2_{0.05,\,2} = 5.991$$

Since 4.92 < 5.991, accept H0.
Village abandonment levels are not contingent upon road type
within the Colchane region of Chile and Bolivia (χ² = 4.92,
0.10 > p > 0.05).
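The whole contingency analysis can be checked with a short, dependency-free Python sketch. The function name `chi_square` is an assumption for illustration; the observed counts and the 5.991 critical value are taken from the example, and the expected frequencies are derived from the row and column totals as described.

```python
def chi_square(observed):
    """Return (chi-square statistic, degrees of freedom) for a contingency table.

    observed -- list of rows, each a list of observed frequencies
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, f in enumerate(row):
            # Expected frequency from the row and column totals.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (f - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df


# Rows: High, Moderate, Low abandonment; columns: Graded Road, Dirt Track.
observed = [[3, 14], [7, 9], [10, 9]]
chi2, df = chi_square(observed)

CRITICAL_VALUE = 5.991  # chi-square critical value for alpha = 0.05, df = 2 (from the example)
decision = "accept H0" if chi2 < CRITICAL_VALUE else "reject H0"
print(round(chi2, 2), df, decision)
```

Because this sketch keeps the expected frequencies unrounded, the statistic comes out marginally below the hand-calculated 4.92 (about 4.91); the difference is rounding only and the decision is unchanged.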