* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Lecture6N
		                    
		                    
								Survey							
                            
		                
		                
                            
                            
								Document related concepts							
                        
                        
                    
						
						
							Transcript						
					
					Why is it there? (How can a GIS analyze data?) Getting Started, Chapter 6 Paula Messina GIS is capable of data analysis  Attribute Data Describe with statistics  Analyze with hypothesis testing   Spatial Data Describe with maps  Analyze with spatial analysis  Describing one attribute Flat File Database Attribute Attribute Attribute Record Value Value Value Record Value Value Value Record Value Value Value Attribute Description    The extremes of an attribute are the highest and lowest values, and the range is the difference between them in the units of the attribute. A histogram is a two-dimensional plot of attribute values grouped by magnitude and the frequency of records in that group, shown as a variable-length bar. For a large number of records with random errors in their measurement, the histogram resembles a bell curve and is symmetrical about the mean. Describing a classed raster grid 20 % (blue) = 19/48 15 10 5 If the attributes are:  Numbers  statistical description  min, max, range  variance  standard deviation Statistical description Range : max-min  Central tendency : mode, median, mean  Variation : variance, standard deviation  Statistical description    Range : outliers mode, median, mean Variation : variance, standard deviation Elevation (book example) GPS Example Data: Elevation Table 6.2: Sample GPS Readings Data Extreme Date Time D M S Minimum Maximum Range 6/14/95 6/15/95 1 Day D MS 10:47am 42 30 54.8 75 41 13.8 10:47pm 42 31 03.3 75 41 20.0 12 hours 00 8.5 00 6.2 Elev 247 610 363 Mean   Statistical average Sum of the values for one attribute divided by the number of records n X = X i/n i = 1 Variance The total variance is the sum of each record with its mean subtracted and then multiplied by itself. The standard deviation is the square root of the variance divided by the number of records less one. Standard Deviation   Average difference from the mean st.dev. Sum of the mean subtracted from the value for each record, squared, divided by the number of records1, square rooted. =  (X i - X ) n-1 2 GPS Example Data: Elevation Standard Deviation   Same units as the values of the records, in this case meters. Elevation is the mean (459.2 meters)    plus or minus the expected error of 82.92 meters Elevation is most likely to lie between 376.28 meters and 542.12 meters. These limits are called the error band or margin of error. Standard Deviations and the Bell Curve One Std. Dev. below the mean Mean 542.1 459.2 376.3 One Std. Dev. above the mean Testing Means (1)    Mean elevation of 459.2 meters Standard deviation 82.92 meters What is the chance of a GPS reading of 484.5 meters? • 484.5 is 25.3 meters above the mean • 0.31 standard deviations ( Z-score) • 0.1217 of the curve lies between the mean and this value • 0.3783 beyond it Testing Means (2) Mean 12.17 % 484.5 459.2 37.83 % Accuracy Determined by testing measurements against an independent source of higher fidelity and reliability.  Must pay attention to units and significant digits.  Not to be confused with precision!  The difference is the map GIS data description answers the question: Where?  GIS data analysis answers the question: Why is it there?  GIS data description is different from statistics because the results can be placed onto a map for visual analysis.  Spatial Statistical Description For coordinates, the means and standard deviations correspond to the mean center and the standard distance  A centroid is any point chosen to represent a higher dimension geographic feature, of which the mean center is only one choice.  Spatial Statistical Description  For coordinates, data extremes define the two corners of a bounding rectangle. Geographic extremes    Southernmost point in the continental United States. Range: e.g. elevation difference; map extent Depends on projection, datum etc. Mean Center mean y mean x Centroid: mean center of a feature Mean center? Comparing spatial means Spatial Analysis Lower 48 United States  1996 Data from the U.S. Census on gender  Gender Ratio = # females per 100 males  Range is 96.4 - 114.4  What does the spatial distribution look like?  Gender Ratio by State: 1996 Searching for Spatial Pattern   A linear relation is a predictable straightline link between the values of a dependent and an independent variable. (y = a + bx) It is a simple model of correlation. A linear relation can be tested for goodness of fit with least squares methods. The coefficient of determination r-squared is a measure of the degree of fit, and the amount of variance explained. Simple linear relation best fit regression line y = a + bx observation dependent variable gradient intercept y=a+bx independent variable Testing the relation gr = 117.46 + 0.138 long. GIS and Spatial Analysis    Geographic inquiry examines the relationships between geographic features collectively to help describe and understand the real-world phenomena that the map represents. Spatial analysis compares maps, investigates variation over space, and predicts future or unknown maps. Many GIS systems have to be coaxed to generate a full set of spatial statistics. You can lie with... Maps Statistics Correlation is not causation! Terrain Analysis Paula Messina Introduction to Terrain Analysis What is terrain analysis?  How are data points interpolated to a grid?  How are topographic data sets produced from non-point data?  How are derivative data sets (i.e., slope and aspect maps) produced by ArcView?  What is Terrain Analysis?  Terrain Analysis: the study of groundsurface relief and pattern by numerical methods (a.k.a geomorphometry).  Geomorphology  qualitative  Geomorphometry = quantitative Interpolation to a Grid ? 46 58 70  86 46 58 97 70 86 97 Assumptions:   Elevations are continuously distributed The influence of one known point over an unknown point increases as distance between them decreases Interpolation Using the Neighborhood Model  Inverse-Distance theory dictates:    The value of X > 58 The value of X < 97 The value of X is closer to 58 than 97 46 58 70 x 86 97 Neighborhood Interpolation Using Inverse Distance Weighting R  Zp dp-n P=1 Zx = ArcGIS calls this IDW 46 R  dp-n 58 P=1 Zx= elevation at kernal (point x) x 86 70 Zp = elevation at known point p dp = distance from point x to point p n = “friction of distance” value; usually between 1 and 6  When n=2, the technique is called “inverse-squared distance weighting.” 97 Types of “Neighborhoods” used with IDW  Nearest n Neighbors     in this example, n = 3 this method isn’t effective when there are clusters of points “nearest in quadrants,” and “nearest in octants” searches can help 46 58 x 70 Fixed Radius   a radius is selected points are selected only if they lie within that fixed radius 86 97 46 58 x 70 86 97 Interpolation using the Spline Method  The spline interpolator fits a minimum-curvature surface through input points.   “Rubber sheet fit” The spline interpolator fits a mathematical function to a specified number of nearest points Interpolation Using Kriging  Based on regionalized variable theory  Drift, random correlated component, noise This method produces a statistically optimal surface, but it is very computationally intensive  Kriging is used frequently in soil science and geology  Trend Interpolator  Fits a mathematical function (a polynomial of specified order) to input points      Points may be chosen by nearest neighbor or radius searches --or-All points may be used Uses a least-squares regression fit The surface produced does not necessarily Not available as a menu item pass through the points used in ArcGIS This is an excellent choice when data points are sparse Which ArcView menu interpolator is better?  IDW  Assumption: The variable being mapped decreases in influence with distance • Example: interpolating consumer purchasing power for a retail site analysis  Spline   Assumption: The variable being mapped is a smooth, continuous surface; it is not particularly good for surfaces with large variability over small horizontal distances • Examples: terrain, water table heights, pollution concentration, etc. The Finished Grid 46 58 70 x 86  97 Grids are subject to the “layer cake effect” 46 48 50 52 46 46 44 48 50 54 56 46 56 54 50 52 60 64 68 80 80 56 58 65 74 86 84 80 66 69 73 80 90 88 86 70 75 78 86 94 94 80 72 76 80 84 90 89 84  The Messina “Eyeball” Interpolator was used Point Data Collection in the Field It is critical to obtain data at the corners of the grid extent  It is advisable to obtain the VIPs (Very Important Points) such as the highest and lowest elevations  Other Continuous Surface Sources  USGS DEMs       produced directly from USGS Topographic Maps Elevations of an area are averaged within the grid cell (pixel) High and low points can never be saved as a grid cell value Various techniques (i.e. stereograms) were used to accomplish this process Original datum (i.e. NAD27, NAD83) is preserved in the DEM Spatial resolution: 30m (7.5 minute data), 1 arc-second (1 degree data), 10m*, 5m* *(limited coverage) Other Continuous Surface Sources  Synthetic Aperture Radar, Sidelooking Airborne Radar  Shuttle Missions: • Shuttle Radar Topography Mission, 2/00 • SIR-C , 1994  Other Orbiters • Magellan Mapping Mission of Venus, 1990-1994 Click here to see an animation of the Venutian surface topography  Airborne Radar Mappers • AirSAR/TopSAR • GeoSAR: California mapping Click here to link to Hunter College’s Radar Mapping Web Site How is Slope Computed?  Slope = arctan [ () () dZ dX 2+ dZ 2 ] dY 100 Calculate the slope for the central pixel. Click here for the solution. 130 140 120 150 160 160 170 200 Grid cell = 100m x 100m How is Aspect Computed?  () () Aspect A’ = arctan - dZ dY dZ dX If If If dZ is negative, add 90 to A’ dX dZ is positive, and dX dZ is positive, and dX dZ is negative: add 270 to A’ dY dZ is positive: subtract A’ from 270 dY 100 130 140 120 150 160 160 170 200 Grid cell = 100m x 100m Calculate the aspect for the central pixel. Click here for the solution.
 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                            