Processing Geographic Data: Common Measures Useful in Thematic Mapping
SP 240 Cartography
Alex Chaucer

Data Processing
1. Processing geographic data before mapping in order to reduce, enhance, retain key features, or show primary spatial patterns
2. A preliminary activity of data symbolization

Spatial Information Technology Center

The Need for Data Processing
• Data reduction
• Enhancement
• Retention of key features
• Simplification
Allows you to convey thought in graphic form as simply as possible.

Mathematical and Statistical Methods
Mathematics
• Deals with magnitudes and relationships that can be expressed as numbers and symbols
• Simple ratios, proportions, and percentages
Statistics
• To draw conclusions where uncertainty and variation are present
• To summarize observations
• To describe relationships between two variables
• To make inferences, both estimations and tests of significance

Ratio, Proportion, and Percent
Ratio: A fraction used to express the relationship between two variables. Example: population density.
Proportion: A special ratio that expresses the relationship between the amount in one class and the total in all classes.
Percentage: A proportion expressed per hundred, obtained by multiplying the decimal fraction by 100.

Variables, Values, and Arrays
Variables: Raw data for statistical analysis. Examples: height, rainfall, elevation. Notation: X, Y.
Values: Individual numerical observations of a variable. Examples for the rainfall variable: 16, 18, 25, 34. Notation: x1, x2, x3, xi.
Array: In statistical terminology, an ordered arrangement of values. Examples: ascending, descending. Can be nominal, ordinal, or interval.

Frequency Distributions and Histograms
Frequency Distribution: An ordered array that shows the frequency of occurrence of each value.
Frequency Table: Constructed by dividing the total range of the data into classes and adding up the cells in each class; the result can then be displayed in a histogram.
Relative Frequency: The proportion of observations in a particular class.
Histogram: A graphic way of presenting the frequency or relative frequency of occurrence of a variable.
http://blacklandgrass.tamu.edu/images/latest/histogram.gif

Summarizing Data Distributions
Often it is better to work with predefined statistical divisions for which data may already exist. Example: Metropolitan Statistical Areas or Census Tracts from the Census.
Aggregated: One numerical descriptor has already been calculated for the observed values within the bounded area. Example: the median family income in the Little Rock, Arkansas Metropolitan Statistical Area (MSA) in 1969 was $8,285.

Nominal Scale: The Mode and the Variation Ratio
Nominal Scale Data: Data that result from counting; the simplest data level. Example: 10 hats, 5 coats, 22 pants, and 12 pairs of shoes.
Mode: The class in a nominal distribution with the highest frequency (pants is the mode above).
Variation Ratio: The proportion of occurrences not in the modal class.
Modes and variation ratios are important indices of central tendency and dispersion for nominally scaled data.
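
As a small worked illustration of the two nominal-scale measures, the following Python sketch computes the mode and the variation ratio for the clothing counts in the example. The function name `mode_and_variation_ratio` is hypothetical, and ties for the mode are resolved by taking the first class encountered, which is an assumption the slides do not address.

```python
from collections import Counter


def mode_and_variation_ratio(counts):
    """Return (modal class, variation ratio) for a mapping of class -> frequency."""
    freq = Counter(counts)
    total = sum(freq.values())
    if total == 0:
        raise ValueError("at least one observation is required")
    # most_common(1) returns the class with the highest frequency;
    # ties go to the class inserted first (an assumption, not from the slides).
    modal_class, modal_count = freq.most_common(1)[0]
    # Variation ratio: proportion of occurrences NOT in the modal class.
    variation_ratio = 1 - modal_count / total
    return modal_class, variation_ratio


# Counts taken from the nominal-scale example in the slides.
clothing = {"hats": 10, "coats": 5, "pants": 22, "shoes": 12}
modal_class, vr = mode_and_variation_ratio(clothing)
print(modal_class, round(vr, 3))  # pants 0.551
```

Here 27 of the 49 items fall outside the modal class, so the variation ratio is 27/49, or about 0.55.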
Ordinal Scale: The Median and Percentiles
Ordinal Data: Data that provide a ranking (less-than, more-than decisions).
Median: The place in a ranked ordinal data set that neither exceeds nor is exceeded in rank by more than 50 percent of the observations; used to describe ordinal data.
Percentile: The place in a ranked data set that divides the observations into a specified portion of all the observations.

Interval and Ratio Scales: The Arithmetic Mean and the Standard Deviation
Interval/Ratio Data: Data resulting from measurement; the most sophisticated data level.
• Interval: arbitrary starting points (Fahrenheit temperature scale)
• Ratio: zero as a starting point (Kelvin scale)
Arithmetic Mean: Add all the values and divide by the number of values.
Standard Deviation: The square root of the variance; used to describe dispersion around the arithmetic mean in an interval/ratio data set.
Variance: The statistic used most commonly to describe dispersion around the arithmetic mean in an interval/ratio data set.
Normal Distribution: A frequency distribution represented by a bell-shaped curve; used as a basis for comparison in many statistical measures.
http://www.duncanwil.co.uk/norm_files/image009.jpg
Skewness: A numeric value of deviation from the normal, or bell-shaped, frequency distribution.
Kurtosis: A numeric value indicating "peakiness" in a frequency distribution.
http://www.pqsystems.com/eline/2001/02/kurtosis.gif

Measures of Areal Concentration and Association
Cartographers and geographers look for patterns of spatial variation. The following slides look at some of the methods for describing spatial patterns.

The Coefficient of Areal Correspondence
Coefficient of Areal Correspondence: A method of comparing areal spatial distributions; defined as a ratio. May suggest causal relationships as values approach 1.0.

Areal Means
Areal Mean: A method of calculating the spatial balance of a set of data points. Over time, these centers of areas can give insight into a spatially dynamic pattern.
Standard Distance: A measure for depicting dispersion around an areal mean.
1. Both include all events (points) used in the analysis.
2. Because all observations are used, both are extremely sensitive to change in any one observation.
3. Because squares of distance are used in computing the standard distance, its value is strongly affected by points at extreme distances from the areal mean.

Location Quotient
Location Quotient: A measure of geographical concentration; illustrates deviation from an assumed proportional share.
• Can show geographical localization.
• Can be useful in guiding further studies.
• Compares the local economy to a reference economy, in the process attempting to identify specializations in the local economy.

Measuring Spatial Association by Regression Methods
Residuals from regression: Differences between observed y (or x) values and those estimated by the regression line.
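
To make the definition of residuals concrete, here is a minimal Python sketch that fits a straight line and reports the observed-minus-estimated y values. The slides do not name a fitting method, so ordinary least squares is an assumption; the function name `fit_line_and_residuals` and the sample points are hypothetical and are not data from the course.

```python
def fit_line_and_residuals(xs, ys):
    """Fit y = a + b*x by ordinary least squares; return (a, b, residuals)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = sum of cross-deviations / sum of squared x-deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Residual = observed y minus the y estimated by the regression line.
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    return a, b, residuals


# Hypothetical sample points (illustration only).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 5.0, 8.0]
a, b, residuals = fit_line_and_residuals(xs, ys)
print(b, residuals)
```

With a least-squares fit the residuals sum to approximately zero, so it is their individual signs and sizes, observation by observation, that carry the information.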
Idea behind regression: Looking for a causal relationship between two variables, one dependent and one independent.
Scattergram: A diagram containing a plot of data points, each of which has a value in two dimensions; a graphic way to illustrate mathematical correlation; also called a scatterplot.
Regression: A line drawn on a graph to depict the relationship between two variables; linear regression is a common form.
Correlation: A method of showing the mathematical association between two or more variables.

Data Classification
Classification: Scientific reduction of a large number of individual observations, events, or numbers into smaller groups to facilitate explanation; also known as taxonomy.

Classification is More
• By grouping data, generalizations can be made.
• Can compute an average, and then classify each value as being above or below the average.
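
The average-based grouping in the last point can be sketched in a few lines of Python. The example reuses the rainfall values 16, 18, 25, 34 from the earlier slide on variables and values; the function name `classify_by_mean` and the choice to label a value exactly equal to the mean as "above" are assumptions made only for illustration.

```python
def classify_by_mean(values):
    """Label each value 'above' or 'below' relative to the arithmetic mean."""
    if not values:
        raise ValueError("at least one value is required")
    mean = sum(values) / len(values)
    # Assumption: a value exactly equal to the mean is labeled 'above'.
    labels = ["above" if v >= mean else "below" for v in values]
    return mean, labels


# Rainfall values from the earlier slide.
rainfall = [16, 18, 25, 34]
mean, labels = classify_by_mean(rainfall)
print(mean, labels)  # 23.25 ['below', 'below', 'above', 'above']
```

This reduces four individual observations to two classes, which is the kind of simplification the classification slides describe.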