2008 Data East company profile
• Company of about 85 employees
• Based in Akademgorodok (Novosibirsk, Russia)
• Founded from the “Novosibirsk Regional Center of Geoinformation Technologies of the Russian Academy of Sciences”

Own products and services
Services:
• GIS software development service
• Data preparation service
Products:
• Extensions for ArcGIS
• Drive Time Engine
• Personal Internet Map Server
• Map Engine
• Well Tracking

Map Engine
• DoubleGis products’ line:
  – Desktop system
  – PocketPC application
• Atlas of Siberian Region
• Navigation system for Siberian region
• Data East products (CityExplorer, PersonalIMS, etc.)

Personal IMS

Data preparation service

Partners and customers worldwide
• ESRI, Inc. (USA)
• ESRI UK
• GlobeXplorer, Inc. (USA)
• NewFields, Inc. (USA)
• Exponent (USA)
• InstallShield, Inc. (USA)
• Schlumberger
• The Crown Estate (UK)
• ChevronTexaco (USA)
• Shell Group
• De Beers Group
• USGS (USA)
• U.S. Army Corps of Engineers (USA)
• Bowater (Canada)
• Rotorua District Council (New Zealand)
• Geoscience Australia (Australia)
• Bristol City Council (UK)
• Newcastle City Council (UK)
• Bureau of Land Management (USA)
• U.S. Fish and Wildlife Service (USA)
• Tauw bv (Netherlands)
• Washington State Department of Ecology (USA)
• and more…

Data Mining in Geoinformation Systems
Data Mining Tasks:
• Prediction
• Classification
• Clustering
• Associations Discovery
• Sequence-based Analysis
• On-Line Analytical Processing (OLAP)

Forecast sales for new store location
Target variable – sales
Properties of stores:
• Size
• Number of employees
• Number of parking spaces
Trade area attributes:
• Demographic variables like income, age, educational attainment, ethnicity
• Intersections with competitors

Prediction Task: 7 Steps to Glory
Step 1: Preparation of datasets
• The set of objects must be homogeneous
• The same measurement must use the same scale for all objects
• The set of measurements should be complete for every object
• The target variable must not be used when calculating the values of source variables
• The number of objects should be sufficiently large
Step 2: Calibration of variables
Types of variables:
• Boolean variable (multi-valued logic is allowed)
• Nominal variable
• Ordered nominal variable
• Discrete variable
• Continuous variable
• Continuous variable with constraints
• Continuous variable of exp-type
Step 3: Statistical Analysis
• Calculate the mean value and the standard deviation for every variable
• Calculate the correlation matrix
Step 4: Normalization of source variables
Step 5: Reduction of source variables
Step 6: Thinning data and finding outliers
Step 7: Constructing a predictor
• Calculate the predictor with minimal complexity
• Test the predictor on an independent sample dataset

On-Line Analytical Processing
Datasets for Analysis:
• Fact table
• Categorization of columns to be mapped to dimensions of the cube
Cube structure:
• Measures
• Dimensions categorized in hierarchies
• Attributes of members
Query language:
• MDX
• JOLAP
• Specialized

Spatial OLAP for ArcGIS Desktop
• Select a spatial dimension
• Select a geoprocessor
• Specify a request to OLAP provider
• Select dimension members
• Select attributes of feature layer

Splines for Data Mining under dot.net
SDM Data:
• Core objects (vectors, vector collections)
• Matrices
• Solvers of SLAEs
SDM Mining:
• Calibrators
• Core Data Mining (statistics, outlier analysis, Least Squares fitter)
• Transformations of variables
• Approximation (polynomial regression, radial basis functions)
SDM Splines:
• Univariate polynomial splines (interpolation, smoothing, averaging)
• Multivariate analytic splines (interpolation, smoothing, regression, spline-collocation)

Contact information
At Data East we are always open for cooperation and new partnerships!
Address: Data East, LLC, P.O. Box 664, Novosibirsk 630090, Russia
Phone: +7 (383) 3-320-320
Fax: +7 (383) 3-325-785
E-mail: [email protected] [email protected]
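
Appendix: a sketch of the prediction task

As an illustration of Steps 3, 4, 5 and 7 of the prediction task (statistical analysis, normalization, reduction of source variables, constructing and testing a predictor), the following minimal Python sketch may help. It is not the SDM library and does not reflect its API: the function names (`zscore`, `correlation`, `fit_line`), the store data, and the choice of a single-variable least-squares line as the "predictor with minimal complexity" are all assumptions made for illustration.

```python
from statistics import mean, pstdev

def zscore(values):
    """Step 4: rescale a variable to zero mean and unit standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def correlation(xs, ys):
    """Step 3: Pearson correlation, computed as the mean product of z-scores."""
    return mean(a * b for a, b in zip(zscore(xs), zscore(ys)))

def fit_line(xs, ys):
    """Step 7: least-squares line y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

# Made-up store data (hypothetical values, not from the presentation).
size = [100.0, 150.0, 200.0, 250.0, 300.0, 350.0]     # store size
parking = [12.0, 8.0, 15.0, 9.0, 14.0, 10.0]          # parking spaces
sales = [210.0, 290.0, 410.0, 480.0, 610.0, 690.0]    # target variable

# Step 5 (simplified): keep the source variable most correlated with the target.
sources = {"size": size, "parking": parking}
best = max(sources, key=lambda name: abs(correlation(sources[name], sales)))

# Step 7: fit on the first four stores, test on the two held-out stores.
train_x, test_x = sources[best][:4], sources[best][4:]
train_y, test_y = sales[:4], sales[4:]
slope, intercept = fit_line(train_x, train_y)
errors = [abs(slope * x + intercept - y) for x, y in zip(test_x, test_y)]

print("selected variable:", best)
print("slope:", round(slope, 3), "intercept:", round(intercept, 3))
print("mean absolute error on held-out stores:", round(mean(errors), 1))
```

The held-out stores play the role of the independent sample dataset from Step 7. A real workflow would also cover calibration (Step 2) and outlier handling (Step 6), which this sketch omits.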