Guidelines for Building Self-Organizing Maps
Jochen Wendel, Barbara Buttenfield
University of Colorado - Boulder
[email protected]

Motivation
o SOMs are complex in nature
o No clear instructions on building a SOM correctly
o The literature varies in its recommendations for building a SOM
Establishing guidelines for building SOMs

What is a SOM?
• Self-Organizing Maps, Kohonen Maps (Kohonen 1982), C. von der Malsburg (1973)
• Artificial neural network
• Based on unsupervised learning
• Classification and data reduction (dimensionality)
• GIScience, data mining, biology, computer science, ...

How does the SOM algorithm work?
SOMs organize themselves by competing for the assignment of observations. With each placement, cells adjust their weights, becoming more similar to cells in their immediate vicinity that have received assignments.
– Create a list of observations with values on a number of variables (dimensions)
– Compute the similarity among observations based on all variables
– Iteratively classify on the basis of similarity ("similarity is distance")

How does the SOM algorithm work?
• Every cell is assigned a random weight
• Place an observation in the cell with the most similar weight, based on the input variables (the best matching unit, BMU)
• Update the weights of neighboring cells to be similar to those of the cell just occupied
• Place another observation
• Adjust the weights, etc.
• Repeat this process many times

SOM (typical MATLAB output)
• 16x16 total
• Talk about the variables
• What do you see?
• U-matrix

Case Study Data
• List of 100+ GIS commands compiled
• Attributes describe each command ("keywords")
• Commands focus on hydrology modeling
• Simple versus compound commands
• Binary matrix

Case Study Data
Keyword              Coding
Raster Only          1 = task is raster data only
Raster and Vector    1 = operates on both
Vector Only
Data Management      1 = task is a data mgt. function (copy, delete, etc.)
Simple               1 = atomic command
Compound
Geometric            1 = task modifies geometry
Attribute
Terrain              1 = task deals with terrain
Flow                 1 = task deals with flow
Regional             1 = task works on neighborhood
Local                1 = task works on each individual pixel
CSR                  1 = task changes spatial relation
Terrain and Flow
Global

Steps in Building a SOM
• Initialization
• SOM size
• SOM shape
• Neighborhood geometry
• Training length
• Quantifying the error
• Edge effects

SOM Size
Qualitative recommendations (Ultsch and Simon, 1990):
• What is your goal with the SOM?
• Small, medium and large SOMs
Quantitative recommendations:
• Avoid empty cells
• Optimal size after Vesanto (2005)

SOM Size
n = 108 obs
msize = 5 * SQRT(108) = 51.96 cells
[Figure: Flow, shown for 8x8, 16x16 and 32x32 SOMs]

Optimal Size after Vesanto (2005)
• The optimal solution after Vesanto (2005) would suggest 165 as the total SOM size

SOM Shape
• Quantitative recommendations (Kohonen 1995):
• Symmetrical vs. non-symmetrical shapes
• Hexagonal vs. square grid

SOM Shape
Symmetrical: 8x8, 16x16, 32x32
Non-symmetrical: 12x8, 24x16, 36x34

Quantifying the Error - U-matrix
• Unified Distance Matrix
• Double the number of cells
• Distance measurement between attributes
• Clusters can be seen as a metaphor for landscape
• Quantization error

Quantifying the Error
[Figure: SOMs with their quantization errors; labels in source order: qe = 0.468, qe = 0.1084, 12x8, 24x16, qe = 0.3689, qe = 0.0422, qe = 8.6026 x 10^-5, 36x34, qe = 2.4736 x 10^-5]

Guidelines (Size and Shape)
• Remove redundancy in the dataset (degrees of freedom)
• Vesanto (2005) helps determine the SOM size
• Non-symmetrical SOMs have fewer edge effects
• Quantization error is a helpful measure for avoiding overfitting of the model (values very close to zero signal overfitting)
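
The size heuristic, the training loop and the quantization error described in the slides can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the MATLAB workflow used for the case study: the square grid, the Gaussian neighborhood, the linear decay of learning rate and radius, and all function names (`suggested_cells`, `train_som`, `quantization_error`) are choices made here for illustration, and the random binary matrix is only a stand-in for the GIS command data.

```python
import math
import random


def suggested_cells(n_obs):
    """Size heuristic quoted on the slides: msize = 5 * sqrt(n)."""
    return 5 * math.sqrt(n_obs)


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_matching_unit(weights, obs):
    """Grid position (row, col) of the cell whose weights are closest to obs."""
    return min(weights, key=lambda rc: sq_dist(weights[rc], obs))


def train_som(data, rows, cols, n_iter=1000, lr0=0.5, seed=0):
    """Train a SOM on a square grid (assumed) with a Gaussian neighborhood (assumed)."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Every cell starts with a random weight vector.
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    radius0 = max(rows, cols) / 2
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1 - frac)                    # assumed linear decay
        radius = max(radius0 * (1 - frac), 0.5)  # shrinking neighborhood
        obs = rng.choice(data)                   # place one observation
        br, bc = best_matching_unit(weights, obs)
        for (r, c), w in weights.items():
            grid_d2 = (r - br) ** 2 + (c - bc) ** 2
            h = math.exp(-grid_d2 / (2 * radius ** 2))
            # Pull the cell toward the observation, more strongly near the BMU.
            for i in range(dim):
                w[i] += lr * h * (obs[i] - w[i])
    return weights


def quantization_error(weights, data):
    """Mean Euclidean distance between each observation and its BMU's weights."""
    total = 0.0
    for obs in data:
        bmu = best_matching_unit(weights, obs)
        total += math.sqrt(sq_dist(weights[bmu], obs))
    return total / len(data)


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical stand-in data: 108 observations, 10 binary attributes.
    data = [[float(rng.randint(0, 1)) for _ in range(10)] for _ in range(108)]
    print("suggested cells:", round(suggested_cells(len(data)), 2))
    for rows, cols in [(4, 4), (12, 8)]:
        som = train_som(data, rows, cols)
        print(f"{rows}x{cols} qe =", quantization_error(som, data))
```

Running the script prints the suggested cell count for 108 observations and the quantization error for a small and a larger map. Comparing such values across candidate sizes and shapes is one way to apply the guidelines above, keeping in mind that an error very close to zero is read on the slides as a sign of overfitting.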