Python for Geospatial

Arthur Lembo, Matthew Bucklew, Jessica Molnar, Joshua Young
Department of Geography and Geoscience
© Arthur J. Lembo, Jr. Salisbury University

Overview

• Python Language (8:30 – 9:45)
  – Variables
  – Structures
  – Statements
  – Program control
• Python Packages (9:45 – 10:45)
  – Google
  – Microsoft
  – Esri
• Arcpy Examples (10:45 – 11:45)
• Student Presentations (11:45 – 12:00)

Purpose

• Introduction to Python for geospatial analysis
• Basic Python
• Spatial and database tools
  – QGIS
  – Arcpy
• Special Python packages for geospatial
  – Postgres/PostGIS; SQLite
  – Microsoft Excel
  – Geocoding
  – Charting
  – Manifold GIS

At the end of this workshop you will

• Understand the benefits of scripting geospatial solutions using Python.
• Understand the basics of the Python language such as variables, flow control, data structures, and reading/writing data.
• Know how to utilize external Python packages for geospatial analysis.
• Know how to integrate multiple packages together into a cohesive solution.

What is Python

Python is an interpreted, interactive, object-oriented programming language. It incorporates modules, exceptions, dynamic typing, very high level dynamic data types, and classes. Python combines remarkable power with very clear syntax. It has interfaces to many system calls and libraries, as well as to various window systems, and is extensible in C or C++. It is also usable as an extension language for applications that need a programmable interface. Finally, Python is portable: it runs on many Unix variants, on the Mac, and on Windows 2000 and later.
– www.python.org

• A Tour of Python
• How to find what you want
  – How do I run a for loop?
  – How do I read from a file?
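The two "How do I..." questions above can be answered with a few lines each. The following is a minimal sketch, not taken from the slides: it is written for Python 3 (the slides use Python 2 print statements), and the file name and contents are made up so the example can run anywhere.

```python
import os
import tempfile

# How do I run a for loop? Iterate directly over the items of a list.
squares = []
for n in [1, 2, 3]:
    squares.append(n * n)
print(squares)

# How do I read from a file? Write a small throwaway file first so the
# example is self-contained (the file name and contents are made up).
path = os.path.join(tempfile.mkdtemp(), "example.txt")
with open(path, "w") as out:
    out.write("first line\nsecond line\n")

lines = []
with open(path) as f:
    for line in f:                       # a file object yields one line at a time
        lines.append(line.rstrip("\n"))  # drop the trailing newline
print(lines)
```

The `with` statement closes the file automatically, which avoids the separate `close()` calls needed when a file is opened with a bare `open(...)`.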
The many flavors of Python

• Versions
  – 2.6, 2.7
  – 3.0, 3.3, 3.6
• Interpreters
  – IDLE
  – WinPy
  – Anaconda
• The Locations – and why they are important!
  – Python directories
  – Arcpy directory

Starting with IDLE and WinPy

• Working with the shell
• Working with files

The Python Language

• Python as a calculator
• Variables and data types
• Structures
• Statements
• Expressions
• Methods and Functions

Python as a calculator

• Just enter some numbers

JM Quiz

• What is the square root of 7 x 8?
• What is 12 raised to the power of 4?
• What is 5 factorial?

Variables and Data Types

• A variable is a name for a value. The computer stores the value in memory. The variable's type determines which operations can be performed on it.
• Some variable types include:
  – Strings
  – Numbers
    • Integer
    • Floating Point
  – Lists

Variables and methods

• String: len, split, find, replace, left/right slices ([:n], [-n:])
• Number: floor, round, / vs. // vs. float
• List: append, len, max, min, sort

Structures - Lists

    a = [66.25, 333, 333, 1, 1234.5]
    a[1]                   # second item (indexes start at 0)
    a.reverse()            # reverses the list in place
    print a
    a.insert(2, 1.00)      # insert 1.00 at index 2
    a.remove(1.0)          # removes the first item equal to 1.0
    a.pop(2)               # removes and returns the item at index 2
    sortlist = sorted(a)   # sorted copy; "sortlist = a" would only alias a
    sortlist

MB Quiz

• How many letters are in supercalifragilisticexpialidocious?
• What is the 9th letter in the word?
• Use split to create a list of the Beatles: 'John', 'George', 'Paul', 'Ringo'
• Create two variables:
  – Firstname = 'John'
  – Lastname = 'Lennon'
• Join the two names into a new variable called FullName
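The operations listed on the "Variables and methods" slide can be tried on a sample word and list. This is a minimal sketch under a few assumptions: it uses Python 3 syntax rather than the Python 2 used on the slides, and the sample values are invented so the quiz questions stay unanswered.

```python
import math

# Strings: length, slicing from the left and right, find, replace, split.
word = "geospatial"
length = len(word)             # number of characters
left3 = word[:3]               # first three characters
right3 = word[-3:]             # last three characters
position = word.find("spa")    # index where "spa" starts, or -1 if absent
swapped = word.replace("geo", "aero")
parts = "north,south,east,west".split(",")

# Numbers: true division, floor division, floor and round.
true_div = 7 / 2               # 3.5 in Python 3 (Python 2 gives 3 for two ints)
floor_div = 7 // 2             # 3 in both Python 2 and 3
floored = math.floor(3.7)
rounded = round(3.14159, 2)

# Lists: append, len, max, min, sort.
values = [17, 9, 12]
values.append(16)
values.sort()                  # sorts in place
print(length, left3, right3, position, swapped, parts)
print(true_div, floor_div, floored, rounded)
print(values, len(values), max(values), min(values))
```

The difference between `/` and `//` is the one most likely to surprise when moving between Python 2 and Python 3, which is why the slide calls it out.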
Python as a module

• Add code
• Save
• Run the module

    array = [12, 9, 17, 16]
    sumtotal = 0
    for i in array:
        sumtotal = i + sumtotal
    print sumtotal      # total accumulated by the loop
    print sum(array)    # same result from the built-in sum

• Quiz
  – Instead of the sum, calculate the sum of the squares (use the pow function)
  – Print out the sum of the squares and the average of the squares

• Program Design
• Code Writing
• Testing
• Error Correction
• Logic Correction

Algorithms (Program Design)

• Most important part of your program
• Practice algorithm as a group
  – Inputs
  – Uses
  – Outputs
  – Uses

Algorithm - a finite list of well-defined instructions for accomplishing some task

• Algorithms are essential to the way computers process information
• Algorithms must be rigorously defined:
  – they must be specified so that they apply in all possible circumstances that could arise;
  – conditional steps must be systematically dealt with, case-by-case;
  – the criteria for each case must be clear (and computable).

    array = [2, 4, 6, 3]
    largest = 0         # starting at 0 assumes at least one positive value
    for i in array:
        if i > largest:
            largest = i
    print largest

Have variable for:
  – largest
Look up in Help:
  – list
  – for loop
  – if..then..else

JM Quiz - Temperature program

Now use the input statement

The Python Language

Lists

• Python supports arrays in the form of lists. These include a basic list:

    mylist = ["art", "bob", "sue"]
    mylist[0]

• or, lists within lists:

    mylist = [["art", 52], ["bob", 19], ["Sue", 44]]
    mylist[0]
    mylist[0][0]

• And, we can do things with lists: sort, remove, append….

Dictionaries

    mydict = {'Name': 'Art', 'Age': 52, 'Area': 69 * 44}
    print mydict['Name']
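A dictionary looks values up by key rather than by position. The sketch below shows the most common operations, assuming Python 3 syntax (the slides use Python 2); the keys and values are made up for illustration.

```python
# A dictionary maps keys to values; the keys and values here are made up.
layer = {"name": "parcels", "features": 1200}

layer["crs"] = "EPSG:4326"          # add a new key
layer["features"] = 1250            # overwrite an existing key

has_crs = "crs" in layer            # membership test on keys
owner = layer.get("owner", "n/a")   # .get returns a default instead of raising KeyError

# Loop over key/value pairs; sorted() gives a stable order for printing.
summary = []
for key, value in sorted(layer.items()):
    summary.append(key + "=" + str(value))
print(", ".join(summary))
```

Using `.get` with a default is a convenient way to handle lookups for keys that may be missing, such as a state abbreviation with no matching city.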
    states = {
        'Oregon': 'OR',
        'Florida': 'FL',
        'California': 'CA',
        'New York': 'NY',
        'Michigan': 'MI'
    }
    cities = {
        'CA': 'San Francisco',
        'MI': 'Detroit',
        'FL': 'Jacksonville'
    }
    cities['NY'] = 'New York'
    cities['OR'] = 'Portland'
    print "NY State has: ", cities['NY']

Methods and Functions

Files
  – Read a file
  – Write a file

    # read a file line by line
    f = open('c:/tugis/python/addresses.csv')
    for line in f:
        print line
    f.close()

    # copy a file line by line
    f = open('c:/tugis/python/addresses.csv')
    o = open('c:/tugis/python/addout.csv', 'w')
    for line in f:
        o.write(line)
    o.close()
    f.close()

Quiz

• Read the file 'benchmarks.csv'
• Write the file out to 'benchmarksnew.csv'

Statements and Control

• Change variables (str, int, float)
• Work with external data (import, export)
• Program Control
  – if
  – while
  – for

Python modules (packages)

• A module is a file containing Python definitions and statements.

    import math
    math.cos(34)    # the argument is in radians

• There are many, many “standard” modules built into Python by default
• There are many, many, many, MANY other packages you can install

Default modules

• Modules for Python 2.7
  – time
  – math
  – HTMLParser
  – urllib
  – sqlite3

SQLite Example (not spatial, yet)

    import sqlite3
    conn = sqlite3.connect('c:/tugis/spatialite/sql.sqlite')
    answer = conn.execute('SELECT * FROM parcels LIMIT 3')
    for row in answer:
        print row

Quiz

• Find the average ASMT value for each Propclass in parcels

pip – the magic of Python

• Run pip to install third party modules:

    pip install numpy
    pip install googlemaps
    pip install pygal
    pip install geocoder

pip is a package management system used to install and manage software packages written in Python. Many packages can be found in the Python Package Index (PyPI).
Python 2.7.9 and later (on the python2 series), and Python 3.4 and later include pip (pip3 for Python 3) by default.

Those dreaded Python locations

    python -m pip install psycopg2 --target=C:\Python27\ArcGIS10.3\Lib\site-packages

Now for the spatial stuff

Geocoding: geopy, geocoder, censusgeocoder, Google
Projections: Vincenty
Spatial operations: shapely
Spatial database: sqlite3

Geocoding

    import requests.packages.urllib3
    requests.packages.urllib3.disable_warnings()
    import geocoder

    f = open('c:/tugis/python/addresses.csv', 'r')
    googleout = open('c:/tugis/python/googleoutaddress.csv', 'w')
    googleout.write("\"address\",\"lon\",\"lat\",\"matchcode\"\n")
    for line in f:
        address = line.strip()    # drop the trailing newline
        gc = geocoder.google(address)
        googleout.write(gc.address.replace(',', '') + ',' + str(gc.lng) + ',' + str(gc.lat) + ',' + gc.accuracy + '\n')
        print address, " has an accuracy of ", gc.accuracy
    googleout.close()
    f.close()

Benchmark Accuracies

    import geocoder, csv

    f = open('c:/tugis/python/benchmarks.csv', 'r')
    newf = csv.reader(f)
    i = 0
    for line in newf:
        if i > 0:    # skip the header row
            gc = geocoder.google([line[1], line[0]], method='elevation')
            diff = gc.elevation - float(line[6])
            print str(gc.elevation) + ' - ' + str(line[6]) + ' = ' + str(diff)
        i = i + 1
    f.close()

Vincenty

    import csv, vincenty

    f = open('c:/tugis/python/benchmarks.csv', 'r')
    newf = csv.reader(f)
    newyork = (40.7791472, -73.9680804)
    i = 0
    for line in newf:
        if i > 0:    # skip the header row
            mypoint = (float(line[1]), float(line[0]))
            # distance in miles from each benchmark to New York
            print vincenty.vincenty(mypoint, newyork, miles=True)
        i = i + 1
    f.close()

SpatiaLite Example
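SpatiaLite is loaded as an extension on top of an ordinary sqlite3 connection, so the connect, execute, and iterate pattern is the same as in plain SQLite. The sketch below exercises that pattern against an in-memory database so it runs without sql.sqlite or the SpatiaLite .dll. It assumes Python 3 syntax, and the `wells` table, its columns, and its rows are hypothetical.

```python
import sqlite3

# An in-memory database stands in for sql.sqlite; the table and its
# columns are hypothetical and exist only for this sketch.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE wells (county TEXT, depth REAL)")
con.executemany(
    "INSERT INTO wells VALUES (?, ?)",
    [("Caroline", 40.0), ("Caroline", 60.0), ("Wicomico", 30.0)],
)

# Same connect / execute / iterate pattern as the slides, with an aggregate.
result = con.execute(
    "SELECT county, COUNT(*), MAX(depth) FROM wells "
    "GROUP BY county ORDER BY county"
).fetchall()
for row in result:
    print(row)
con.close()
```

The `?` placeholders let sqlite3 handle quoting of the inserted values, which is safer than building SQL strings by concatenation.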
SpatiaLite Setup

• You can grab the SpatiaLite .dlls from here, or use the accompanying .zip file from the training packet:
  – http://www.gaia-gis.it/gaia-sins/
• Make sure to choose 32 or 64 bit, whichever is appropriate
• Place the .dll in a directory you can reference in your code

SpatiaLite Setup

• If you can set the PATH on your system, add the directory where you installed the .dlls:
  – PATH = %PATH%;c:\tugis\python\resources(32bit)\
• Or, in the DOS prompt, set the path (like above) and then run Python from that DOS prompt
  – C:\Python27\Lib\idlelib\idle.bat

    # this script connects to spatialite using the 32-bit spatialite.dll
    import sqlite3
    con = sqlite3.connect('c:/tugis/spatialite/sql.sqlite')
    con.enable_load_extension(True)
    con.load_extension('spatialite.dll')
    a = con.execute('SELECT parks.geometry, parks.* from parks, parcels ' \
                    'where st_intersects(parks.geometry,parcels.geometry) limit 5')
    for row in a:
        print row[3]

MB Quiz

• Find the sum of the assessment (ASMT) for all the parcels that intersect the X zone in the firm layer.
• Now, do the same thing, but GROUP BY the propclass value.
• Now find the ASMT total for each zone.

Microsoft Excel

Quick and dirty Excel

    pip install pypiwin32

    import win32com.client
    excel = win32com.client.Dispatch("Excel.Application")
    mylist = [10, 20, 30, 40, 50, 60, 70, 80, 90]
    myAvg = excel.WorksheetFunction.Average([mylist])
    print myAvg

Excel quiz

• What is the standard deviation of the list?
• Create another list called mylist2: [16,22,30,45,58,69,77,88,99]
• What is the Pearson correlation coefficient?

    # Here we are importing the APIs. Specifically, we will need
    # sqlite3 to talk to sqlite, win32com to talk to Excel
    # and numpy to do some mathematical operations
    import sqlite3, os, time, win32com.client, numpy as np

    # here we make the connection to Excel....
    excel = win32com.client.Dispatch("Excel.Application")

    con = sqlite3.connect('c:/tugis/spatialite/sql.sqlite')
    # fetchall() returns the selected rows as a list
    rows = con.execute('SELECT ASMT FROM parcels limit 50').fetchall()
    newrow = []

    # Just a test to print out the rows
    #for row in rows:
    #    print "The value is: ", row[0]
    #    newrow.append([row[0]])

    # Here we will issue an Excel function to determine the standard deviation from the
    # selected values
    #thestdev = excel.WorksheetFunction.StDev(newrow)
    #print thestdev

    # Now we will add the confidence interval, as calculated by Excel
    #conf = excel.WorksheetFunction.Confidence(.45,thestdev,len(newrow))
    #print 'Average: ' + str(np.average(newrow)) + ' Confidence Interval: ' + str(conf)
    #print ' '
    #print ' '
    #print str(np.average(newrow) - conf) + ' ' + str(np.average(newrow)) + ' ' + str(np.average(newrow) + conf)
    con.close()

Arcpy Examples

Working with Arcpy in ArcGIS

• Fire up ArcMap
• Click Python window
• Start typing code
  – Look up help files – they are excellent!

Getting Pywin32 to work

Copy
C:\Python27\ArcGIS10.3\Lib\site-packages\Desktop10.3
to
C:\Python27\ArcGIS10.3\Lib\site-packages\Desktop10.3

Buffer parcels and critical area

    import arcpy
    arcpy.env.overwriteOutput = True
    arcpy.env.workspace = "c:/tugis/python/Denton.gdb"
    catemp = arcpy.MakeFeatureLayer_management("CA")
    # Python 2 input() evaluates what is typed, so enter a number
    # or a quoted string such as "100 Feet"
    testvar = input("How far do you want the buffer? ")
    cabuffer = arcpy.Buffer_analysis(catemp, "cabuffer", testvar)
    caclip = arcpy.Clip_analysis("parcels", cabuffer, "parclip")
    arcpy.Delete_management(cabuffer)
    #arcpy.Delete_management(caclip)
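The `StDev` and `Confidence` worksheet functions called in the SQLite-and-Excel script earlier can also be approximated without Excel. The sketch below assumes Python 3.8 or later (for `statistics.NormalDist`) and assumes Excel's documented definition of CONFIDENCE as the normal-distribution half-width z * sd / sqrt(n). The `confidence` helper and the alpha of 0.05 are choices made for this illustration, not part of the slides.

```python
import math
from statistics import NormalDist, mean, stdev

def confidence(alpha, standard_dev, size):
    """Half-width of a normal-based confidence interval for a mean.

    Hypothetical helper written for this sketch; it follows the formula
    z * sd / sqrt(n), with z taken from the standard normal distribution.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * standard_dev / math.sqrt(size)

values = [10, 20, 30, 40, 50, 60, 70, 80, 90]   # the list from the Excel slide
sd = stdev(values)                  # sample standard deviation
half_width = confidence(0.05, sd, len(values))   # alpha chosen for illustration
low = mean(values) - half_width
high = mean(values) + half_width
print(sd, half_width, low, high)
```

Keeping the calculation in Python removes the dependency on a running copy of Excel, at the cost of having to check that the formula matches the worksheet function being replaced.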
MB Total value of land in the critical area by land use

    import arcpy
    import numpy
    arcpy.env.workspace = "C:/tugis/python/Denton.gdb"
    parceltemp = arcpy.MakeFeatureLayer_management("parcels")
    catemp = arcpy.MakeFeatureLayer_management("CA")
    LUs = [row[0] for row in arcpy.da.SearchCursor(parceltemp, "LU")]
    LUs = list(set(LUs))
    for lu in LUs:
        parsel = arcpy.SelectLayerByLocation_management(parceltemp, "INTERSECT", catemp)
        parsubsel = [row[0] for row in arcpy.da.FeatureClassToNumPyArray(parsel, "NFMTTLVL", "LU = '" + lu + "'")]
        totalvalue = numpy.sum(parsubsel)
        avgvalue = numpy.average(parsubsel)
        stdvalue = numpy.std(parsubsel)
        print "total value of " + lu + "= " + str(totalvalue)  #+ " and the CV is " + str(stdvalue / avgvalue)

Move the parsel out of the loop

Total value…but easier

    import arcpy
    import numpy
    arcpy.env.workspace = "C:/tugis/python/Denton.gdb"
    parceltemp = arcpy.MakeFeatureLayer_management("parcels")
    catemp = arcpy.MakeFeatureLayer_management("CA")
    parsel = arcpy.SelectLayerByLocation_management(parceltemp, "INTERSECT", catemp)
    parsummarize = arcpy.Statistics_analysis(parsel, None, [["NFMTTLVL", "SUM"]], "LU")
    for row in arcpy.SearchCursor(parsummarize):
        print row.LU + " $" + str(row.SUM_NFMTTLVL)
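Both scripts above compute one total per land use. The grouping step itself can be sketched in plain Python with a dictionary of running sums, which is roughly what a SUM statistic with "LU" as the case field produces. This sketch assumes Python 3 syntax, and the (land use, value) pairs are invented stand-ins for the LU and NFMTTLVL fields rather than data from Denton.gdb.

```python
from collections import defaultdict

# Hypothetical (land use, value) pairs standing in for the LU and
# NFMTTLVL fields of the selected parcels.
selected = [
    ("Residential", 120000.0),
    ("Commercial", 300000.0),
    ("Residential", 95000.0),
    ("Agricultural", 80000.0),
]

totals = defaultdict(float)
for land_use, value in selected:
    totals[land_use] += value    # one running sum per land use

for land_use in sorted(totals):
    print(land_use + " $" + str(totals[land_use]))
```

A single pass over the selected rows is enough, which is the same reason the second script needs only one selection instead of one per land use.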
Arcpy Potpourri

    import arcpy, numpy, pygal
    arcpy.env.workspace = "c:/tugis/python/denton/denton.gdb"
    arcpy.env.overwriteOutput = True
    acctnum = '0603012174'
    ##acctnum = raw_input('Enter Account Number: ')
    parcels = arcpy.MakeFeatureLayer_management("parcels", "parcels_lyr")
    soils = arcpy.MakeFeatureLayer_management("soils", "soils_lyr")
    selpar = arcpy.SelectLayerByAttribute_management(parcels, "NEW_SELECTION", "ACCTID = '" + acctnum + "'")
    clip_soil = arcpy.Clip_analysis(soils, selpar, "clipsoil")
    soil_stats = arcpy.Statistics_analysis(clip_soil, "soil_stats", [["Shape_Area", "SUM"]], ["MUSYM"])
    mycur = arcpy.SearchCursor(soil_stats)
    pie_chart = pygal.Pie()
    pie_chart.title = 'Soil Amount'
    for row in mycur:
        pie_chart.add(row.MUSYM, row.SUM_Shape_Area)
    pie_chart.render_to_file('c:/tugis/python/' + acctnum + '.svg')

Fuzzy Neuro-logic in Arcpy

    # Import the necessary environment. This includes
    # arcpy, maybe the os or sys, or numpy for mathematics
    #
    import arcpy
    import numpy as np
    import win32com.client
    excel = win32com.client.Dispatch("Excel.Application")

    # Set up the workspace you are working in. That
    # gets us to the data. Here we get the directory
    # and also the feature data set
    arcpy.env.workspace = "c:/tugis/python/denton.gdb"
    parcels = arcpy.MakeFeatureLayer_management("parcels")
    ca = arcpy.MakeFeatureLayer_management("ca")
    #
    # 90% of the time we will be doing some kind of selection.
    # Here we are going to use the SearchCursor to return some
    # features based on an attribute query. The field name goes in
    # double quotes; single quotes would compare two string literals.
    #
    rows = arcpy.SearchCursor(parcels, "\"LU\" = 'Residential'")
    #
    # Once we get a result from a query, we have to get it into
    # an array in order to manipulate it. So, we'll create an
    # empty array and then fill it with a field value
    #
    A = []
    for row in rows:
        A.append(row.NFMLNDVL)
    #
    # In this case, we are going to pass the data to Excel in order
    # to determine a confidence interval. This allows us access to a
    # statistical function offered by Excel that ArcGIS does not have
    #
    thestdev = excel.WorksheetFunction.StDev(A)
    conf = excel.WorksheetFunction.Confidence(.45, thestdev, len(A))
    #
    # Just printing out some data to prove that I computed things correctly
    #
    print 'Average: ' + str(np.average(A)) + ' Confidence Interval: ' + str(conf)
    #
    # Now we can refocus our query to perform another query by selecting records
    # that are 'like' the data we already have from the confidence interval. We
    # will then apply the query and will have found the other parcels that are
    # statistically similar to the parcels we chose earlier
    #
    thetxt = "\"NFMLNDVL\" > " + str(np.average(A) - conf) + " AND " + "\"NFMLNDVL\" < " + str(np.average(A) + conf)
    print thetxt
    therows = arcpy.SearchCursor(parcels, thetxt)
    #
    # Now we can write out the FID for the selected features
    #
    for row in therows:
        print row.FID

Real life scenario

Crisfield Floods

JM MB JY

Scenario: The Town of Crisfield just got flooded this week after the rainstorms. You have been brought in to support the response to the disaster. The Mayor is getting hammered by the press, families, and the State Emergency Management Office, and needs the following:

1. Total number of properties (parcels) that are under 2 feet of water (flood layer, gridcode > 6)
2. Total value of the land (parcels layer, column "NFMTTLVL") that is under 2 feet of water (flood layer, column gridcode > 6)
3. Total value of land, grouped by the landuse ("LU"), that is under 2 feet of water (flood layer, column gridcode > 6)
4. What government buildings (gov_bldg) are under water?
5. Greg Sterling just called. He is out of town and wants to know if his house is under water. If it is, what is the maximum water depth (gridcode)? His account number is '2007124449'.
6. The Highway Superintendent needs to know the names of the streets ("FULL_NAME") and their cross streets ("FROMCROSS", "TOCROSS") that are under 2 feet of water (gridcode = 5) so he can close off the streets.

..and now a word from our students…

Where to go next

• How do I do that in Arcpy
• artlembo.com
• gisadvisor.com training courses
• Esri virtual campus
• Google, Google, Google
  – And don't be surprised if you get directed to stackexchange
• Get a book, any book

Shameless plug

• www.gisadvisor.com/
  – Spatial SQL: A Language for Geographers
  – Python for Geospatial
• How do I do that book series
  – How do I do that…