Python for Geospatial
Arthur Lembo, Matthew Bucklew,
Jessica Molnar, Joshua Young
Department of Geography and
Geoscience
© Arthur J. Lembo, Jr.
Salisbury University
Overview
• Python Language (8:30 – 9:45)
– Variables
– Structures
– Statements
– Program control
• Python Packages (9:45 – 10:45)
– Google
– Microsoft
– Esri
• Arcpy Examples (10:45 – 11:45)
• Student Presentations (11:45 – 12:00)
Purpose
• Introduction to Python for geospatial
analysis
• Basic Python
• Spatial and database tools
– QGIS
– Arcpy
• Special Python packages for geospatial
– Postgres/PostGIS; SQLite
– Microsoft Excel
– Geocoding
– Charting
– Manifold GIS
At the end of this workshop
you will
• Understand the benefits of scripting
geospatial solutions using Python.
• Understand the basics of the Python
language such as variables, flow control,
data structures, and reading/writing data.
• Know how to utilize external Python
packages for geospatial analysis.
• Know how to integrate multiple packages
together into a cohesive solution.
What is Python
Python is an interpreted, interactive, object-oriented programming
language. It incorporates modules, exceptions, dynamic typing,
very high level dynamic data types, and classes. Python combines
remarkable power with very clear syntax. It has interfaces to many
system calls and libraries, as well as to various window systems,
and is extensible in C or C++. It is also usable as an extension
language for applications that need a programmable interface.
Finally, Python is portable: it runs on many Unix variants, on the
Mac, and on Windows 2000 and later. – www.python.org
• A Tour of Python
• How to find what you want
– How do I run a for loop?
– How do I read from a file?
The many flavors of Python
• Versions
– 2.6, 2.7
– 3.0, 3.3, 3.6
• Interpreters
– IDLE
– WinPy
– Anaconda
• The Locations – and why it is important!
– Python directories
– Arcpy directory
Starting with IDLE and
Winpy
• Working with the shell
• Working with files
The Python Language
• Python as a calculator
• Variables and data types
• Structures
• Statements
• Expressions
• Methods and Functions
Python as a calculator
• Just enter some
numbers
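A few lines typed at the shell prompt show the idea. This is a minimal sketch with made-up numbers; `math` is part of the standard library, and the single-argument `print(...)` form used here runs under both Python 2.7 and Python 3.

```python
import math

total = 3 + 4 * 2            # multiplication binds tighter than addition
power = 2 ** 10              # ** raises to a power
root = math.sqrt(144)        # square root lives in the math module
fact = math.factorial(4)     # 4 * 3 * 2 * 1

print(total)
print(power)
print(root)
print(fact)
```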
JM
Quiz
• What is the square root of 7 x 8?
• What is 12 raised to the power of 4?
• What is 5 factorial?
Variables and Data Types
• A variable is a name for a value. The computer
stores the value in memory. Depending upon
what the variable type is, certain operations can
be done with the variable.
• Some variable types include:
– Strings
– Numbers
• Integer
• Floating Point
– Lists
Variables and methods
• String
len, split, find, replace,
left/right ([:n], [-n:])
• Number
floor, round,
/ vs. // vs. float
• List
append, len, max, min, sort
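The methods listed above can be tried one at a time. The sketch below uses made-up values (the variable names are only illustrative) and is written to run under both Python 2.7 and Python 3.

```python
word = "Python for Geospatial"
n_chars = len(word)                  # number of characters
parts = word.split(" ")              # list of words
where = word.find("Geo")             # index of the first match, -1 if absent
swapped = word.replace("Python", "SQL")
left = word[:6]                      # first six characters
right = word[-10:]                   # last ten characters

true_div = 7 / 2.0                   # float division
floor_div = 7 // 2                   # integer (floor) division
rounded = round(3.14159, 2)

values = [12, 9, 17]
values.append(16)                    # add an item at the end
values.sort()                        # sort in place
print(parts)
print(values)
```

Dividing by `2.0` rather than `2` matters under Python 2.7, where `7 / 2` between two integers gives `3`.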
Structures - Lists
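a = [66.25, 333, 333, 1, 1234.5]
a[1]                  # second item (indexes start at 0)
a.reverse()           # reverses in place and returns None
print a
a.insert(2, 1.00)     # insert 1.0 at index 2
a.remove(1.0)         # remove the first item equal to 1.0
a.pop(2)              # remove and return the item at index 2
sortlist = sorted(a)  # sorted copy; "sortlist = a" would only alias a
sortlist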
MB
Quiz
• How many letters are in
supercalifragilisticexpialidocious?
• What is the 9th letter in the word?
• Use split to create a list of the Beatles:
'John','George','Paul','Ringo'
• Create two variables:
– Firstname = 'John'
– Lastname = 'Lennon'
• Append the two names into a new
variable called FullName
Python as a module
• Add code
• Save
• Run the module
array = [12, 9, 17, 16]
sumtotal = 0
for i in array:
    sumtotal = i + sumtotal
print sumtotal
print sum(array)
• Quiz
– Instead of the sum, calculate the sum of
the squares (use the pow function)
– Print out the sum of the squares and
the average sum of the squares
Program Design
Code Writing
Testing
Error Correction
Logic Correction
Algorithms (Program Design)
• Most important part of your program
• Practice algorithm as a group
– Inputs
– Uses
– Outputs
– Uses
Algorithm - a finite list of well-defined
instructions for accomplishing some
task
• Algorithms are essential to the way computers process
information
• Algorithms must be rigorously defined:
– they must be specified so that they apply in all possible
circumstances that could arise;
– conditional steps must be dealt with systematically, case by case;
– the criteria for each case must be clear (and computable).
array = [2,4,6,3]
largest=0
for i in array:
    if i>largest:
        largest=i
print largest
Have variable for
- largest
Look up in Help:
- list
- for loop
- if..then..else
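To illustrate "all possible circumstances", here is one way the largest-value search might be wrapped in a function that also copes with an empty list and with negative numbers. The name `find_largest` is hypothetical. Starting the search from 0 works for the positive numbers on the slide but not for a list of negative values, so this sketch starts from the first item instead. It runs under both Python 2.7 and Python 3.

```python
def find_largest(values):
    """Return the largest item in values, or None for an empty list."""
    if not values:              # case 1: nothing to compare
        return None
    largest = values[0]         # start from a real item, so negatives work
    for v in values[1:]:
        if v > largest:         # case 2: found a bigger item
            largest = v
    return largest

print(find_largest([2, 4, 6, 3]))
print(find_largest([-8, -3, -5]))
print(find_largest([]))
```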
JM
Quiz - Temperature program
Now use the input statement
The Python Language
Lists
• Python supports arrays in the form of lists.
These include a basic list:
mylist = ["art","bob","sue"]
mylist[0]
• or, lists within lists:
mylist = [["art",52],["bob",19],["Sue",44]]
mylist[0]
mylist[0][0]
• And, we can do things with lists:
sort, remove, append….
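A short sketch of those list operations on a list of lists. The records are made up, and the code runs under both Python 2.7 and Python 3.

```python
people = [["art", 52], ["bob", 19], ["sue", 44]]

first_person = people[0]        # the inner list ["art", 52]
first_name = people[0][0]       # "art"

people.append(["ann", 31])      # add a record at the end
people.remove(["bob", 19])      # remove the first matching record
people.sort()                   # sorts by name, the first item of each record
by_age = sorted(people, key=lambda rec: rec[1])   # new list ordered by age

print(people)
print(by_age)
```

`sort()` changes the list in place, while `sorted()` leaves the original alone and returns a new list.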
Dictionaries
person = {'Name': 'Art', 'Age': 52, 'Area': 69 * 44}
print person['Name']
states = {
    'Oregon': 'OR',
    'Florida': 'FL',
    'California': 'CA',
    'New York': 'NY',
    'Michigan': 'MI'
}
cities = {
    'CA': 'San Francisco',
    'MI': 'Detroit',
    'FL': 'Jacksonville'
}
cities['NY'] = 'New York'
cities['OR'] = 'Portland'
print "NY State has: ", cities['NY']
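Two dictionaries like these can be chained: the value looked up in one becomes the key for the other. The sketch below uses a trimmed copy of the data and runs under both Python 2.7 and Python 3.

```python
states = {'Oregon': 'OR', 'Florida': 'FL', 'New York': 'NY'}
cities = {'OR': 'Portland', 'FL': 'Jacksonville'}

# Chain two lookups: state name -> abbreviation -> city
abbrev = states['Oregon']
city = cities[abbrev]
print("Oregon is " + abbrev + " and has " + city)

# get() returns a default instead of raising KeyError for a missing key
missing = cities.get(states['New York'], 'no city recorded')
print("New York: " + missing)

# Loop over the keys in alphabetical order of state name
for name in sorted(states):
    print(name + " -> " + states[name])
```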
Methods and Functions
Files
– Read a file
– Write a file
# Read a file and print each line
f = open('c:/tugis/python/addresses.csv')
for line in f:
    print line
f.close()

# Copy the file line by line
f = open('c:/tugis/python/addresses.csv')
o = open('c:/tugis/python/addout.csv','w')
for line in f:
    o.write(line)
o.close()
f.close()
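The same copy can be written with `with` blocks, which close the files automatically even if an error occurs partway through. This is a sketch, not the workshop's own code: it creates a small stand-in file in a temporary folder (the contents are made up) so it runs anywhere, under both Python 2.7 and Python 3.

```python
import os
import tempfile

# Hypothetical stand-in for addresses.csv so the sketch runs anywhere
folder = tempfile.mkdtemp()
src = os.path.join(folder, 'addresses.csv')
dst = os.path.join(folder, 'addout.csv')
with open(src, 'w') as f:
    f.write('100 Main St\n200 Oak Ave\n')

# Copy line by line; both files are closed when the with blocks end
count = 0
with open(src) as infile:
    with open(dst, 'w') as outfile:
        for line in infile:
            outfile.write(line)
            count = count + 1

print(str(count) + " lines copied")
```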
Quiz
• Read the file ‘benchmarks.csv’
• Write the file out to
‘benchmarksnew.csv’
Statements and Control
• Change variables (str, int, float)
• Work with external data (import, export)
• Program Control
– if
– while
– for
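A compact sketch of the items above: converting between `str`, `int` and `float`, then one example each of `if`, `while` and `for`. All values are made up, and the code runs under both Python 2.7 and Python 3.

```python
# Converting between types
text = "42"
number = int(text)            # string -> integer
ratio = float(number) / 5     # integer -> float before dividing
label = str(ratio)            # float -> string for printing

# if / elif / else chooses one branch
if ratio > 10:
    size = "large"
elif ratio > 5:
    size = "medium"
else:
    size = "small"

# while repeats until its condition is false
countdown = []
n = 3
while n > 0:
    countdown.append(n)
    n = n - 1

# for visits each item of a sequence
squares = []
for value in [1, 2, 3]:
    squares.append(value * value)

print(label + " is " + size)
print(countdown)
print(squares)
```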
Python modules (packages)
• A module is a file containing Python
definitions and statements.
import math
math.cos(34)
• There are many, many “standard”
modules built in Python by default
• There are many, many, many, MANY
other packages you can install
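One detail worth knowing about the `math` example: the trigonometric functions take radians, so `math.cos(34)` is the cosine of 34 radians, not 34 degrees. A small sketch of the difference, runnable under both Python 2.7 and Python 3:

```python
import math

# math.cos expects radians, so this is the cosine of 34 radians
cos_radians = math.cos(34)

# Convert first when the angle is in degrees
cos_degrees = math.cos(math.radians(34))

print(cos_radians)
print(cos_degrees)
```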
Default modules
• Modules for Python 2.7
– time
– math
– HTMLParser
– urllib
– sqlite3
SQLite Example (not spatial,
yet)
import sqlite3
conn = sqlite3.connect('c:/tugis/spatialite/sql.sqlite')
answer = conn.execute('SELECT * FROM parcels LIMIT 3')
for row in answer:
    print row
conn.close()
Quiz
• Find the average ASMT value for
each Propclass in parcels
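The grouping pattern can be tried on a throwaway table before running it against the workshop database. The sketch below builds an in-memory SQLite database; the table layout, column names and values are assumptions for illustration and may not match the real parcels table. It runs under both Python 2.7 and Python 3.

```python
import sqlite3

# In-memory database with a hypothetical parcels table (made-up values)
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE parcels (acctid TEXT, propclass TEXT, asmt REAL)')
rows = [('a1', 'R', 100000.0), ('a2', 'R', 150000.0), ('a3', 'C', 400000.0)]
conn.executemany('INSERT INTO parcels VALUES (?, ?, ?)', rows)

# One result row per group; AVG is computed within each group
query = ('SELECT propclass, AVG(asmt), COUNT(*) FROM parcels '
         'GROUP BY propclass ORDER BY propclass')
result = conn.execute(query).fetchall()
for propclass, avg_asmt, n in result:
    print(propclass + ": " + str(avg_asmt) + " from " + str(n) + " parcels")
conn.close()
```

The `?` placeholders let sqlite3 handle quoting of the inserted values, which is safer than building the SQL string by hand.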
pip – the magic of Python
• Run pip to install third party modules:
pip install numpy
pip install googlemaps
pip install pygal
pip install geocoder
pip is a package management system used to install
and manage software packages written in Python.
Many packages can be found in the Python Package
Index (PyPI). Python 2.7.9 and later (on the
python2 series), and Python 3.4 and later
include pip (pip3 for Python 3) by default.
Those dreaded Python
locations
python -m pip install psycopg2 --target=C:\Python27\ArcGIS10.3\Lib\site-packages
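When several Python installations sit side by side, it helps to ask the running interpreter where it lives and where it looks for packages. A minimal sketch using only the standard library; the output depends on the machine, so none is shown. It runs under both Python 2.7 and Python 3.

```python
import os
import sys

# The interpreter actually running this script
interpreter = sys.executable
version = sys.version_info[:3]

# Directories searched, in order, when a module is imported
search_path = [p for p in sys.path if p]

# Where one already-imported module was loaded from
os_location = os.__file__

print(interpreter)
print(version)
for folder in search_path:
    print(folder)
print(os_location)
```

If a package installed with pip cannot be imported, comparing these folders with the pip install location is a quick first check.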
Now for the spatial stuff
Geocoding: geopy, geocoder, censusgeocoder, Google
Distances: vincenty
Spatial operations: shapely
Spatial database: sqlite3
Geocoding
import requests.packages.urllib3
requests.packages.urllib3.disable_warnings()
import geocoder
f = open('c:/tugis/python/addresses.csv','r')
googleout = open('c:/tugis/python/googleoutaddress.csv','w')
googleout.write('"address","lon","lat","matchcode"\n')
for line in f:
    line = line.strip()   # drop the trailing newline before geocoding
    gc = geocoder.google(line)
    googleout.write(gc.address.replace(',','') + ',' + str(gc.lng) + ','
                    + str(gc.lat) + ',' + str(gc.accuracy) + '\n')
    print line, " has an accuracy of ", gc.accuracy
googleout.close()
f.close()
Benchmark Accuracies
import geocoder, csv
f = open('c:/tugis/python/benchmarks.csv', 'r')
newf = csv.reader(f)
i=0
for line in newf:
    if i > 0:   # skip the header row
        gc = geocoder.google([line[1],line[0]],method='elevation')
        diff = gc.elevation - float(line[6])
        print str(gc.elevation) + ' - ' + str(line[6]) + ' = ' + str(diff)
    i = i + 1
f.close()
Vincenty
import csv, vincenty
f = open('c:/tugis/python/benchmarks.csv', 'r')
newf = csv.reader(f)
newyork = (40.7791472, -73.9680804)
i=0
for line in newf:
    if i > 0:   # skip the header row
        mypoint = (float(line[1]),float(line[0]))
        # distance in miles from this benchmark to New York
        print vincenty.vincenty(mypoint, newyork, miles=True)
    i=i+1
f.close()
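For readers without the vincenty package installed, a great-circle distance can be computed with the standard library alone. Note that this is the haversine formula on a sphere, not Vincenty's ellipsoidal method, so results can differ by a fraction of a percent. The function name `haversine_miles`, the Earth radius value and the Salisbury coordinates are assumptions for illustration. The code runs under both Python 2.7 and Python 3.

```python
import math

def haversine_miles(point_a, point_b):
    """Great-circle distance in miles between two (lat, lon) pairs in degrees."""
    earth_radius_miles = 3958.8          # mean radius; an assumption of the sphere model
    lat1, lon1 = math.radians(point_a[0]), math.radians(point_a[1])
    lat2, lon2 = math.radians(point_b[0]), math.radians(point_b[1])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * earth_radius_miles * math.asin(math.sqrt(h))

newyork = (40.7791472, -73.9680804)      # same reference point as the slide
salisbury = (38.3607, -75.5994)          # approximate, for illustration only
print(round(haversine_miles(salisbury, newyork), 1))
```

As in the slide, points are (latitude, longitude) tuples, which is why the benchmark columns are swapped when `mypoint` is built.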
SpatiaLite Example
SpatiaLite Setup
• You can grab the SpatiaLite .dlls
from here, or use the accompanying
.zip file from the training packet:
– http://www.gaia-gis.it/gaia-sins/
• Make sure to choose 32 or 64 bit,
whatever is appropriate
• Place the .dll in a directory you can
reference in your code
SpatiaLite Setup
• If you can set the PATH on your
system, add the directory where you
installed the .dlls:
– PATH = %PATH%;c:\tugis\python\resources(32bit)\
• Or, at the DOS prompt, set the path
(as above) and then run Python
from that DOS prompt
– C:\Python27\Lib\idlelib\idle.bat
#this script connects to spatialite using the 32-bit spatialite.dll
import sqlite3
con = sqlite3.connect('c:/tugis/spatialite/sql.sqlite')
con.enable_load_extension(True)
con.load_extension('spatialite.dll')
a = con.execute('SELECT parks.geometry, parks.* from parks, parcels ' \
                'where st_intersects(parks.geometry,parcels.geometry) limit 5')
for row in a:
    print row[3]
MB
Quiz
• Find the sum of the assessment
(ASMT) for all the parcels that
intersect the X zone in the firm layer.
• Now, do the same thing, but GROUP
BY the propclass value.
• Now find the ASMT total for each
zone.
Microsoft Excel
Quick and dirty Excel
pip install pypiwin32
import win32com.client
excel = win32com.client.Dispatch("Excel.Application")
mylist = [10,20,30,40,50,60,70,80,90]
myAvg= excel.WorksheetFunction.Average([mylist])
print myAvg
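Results that come back from Excel can be cross-checked in plain Python. The sketch below uses a different, made-up list and computes both the sample and the population standard deviation; Excel's StDev is the sample version (it divides by n - 1). The code runs under both Python 2.7 and Python 3.

```python
import math

values = [2, 4, 4, 4, 5, 5, 7, 9]        # made-up numbers for illustration
n = len(values)
mean = sum(values) / float(n)
squared_diffs = [(v - mean) ** 2 for v in values]

# Sample standard deviation divides by n - 1 (Excel's StDev does the same)
sample_sd = math.sqrt(sum(squared_diffs) / (n - 1))
# Population standard deviation divides by n (Excel's StDevP)
population_sd = math.sqrt(sum(squared_diffs) / n)

print(mean)
print(round(sample_sd, 3))
print(round(population_sd, 3))
```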
Excel quiz
• What is the standard deviation of the
list?
• Create another list called mylist2:
[16,22,30,45,58,69,77,88,99]
• What is the Pearson correlation
coefficient?
# Here we are importing the APIs. Specifically, we will need
# sqlite3 to talk to sqlite, win32com to talk to Excel
# and numpy to do some mathematical operations
import sqlite3, os, time, win32com.client, numpy as np
# here we make the connection to Excel....
excel = win32com.client.Dispatch("Excel.Application")
con = sqlite3.connect('c:/tugis/spatialite/sql.sqlite')
rows = con.execute('SELECT ASMT FROM parcels limit 50')
newrow = []
# Just a test to print out the rows
#for row in rows:
#    print "The value is: ", row[0]
#    newrow.append([row[0]])
# Here we will issue an Excel function to determine the standard deviation from the
# selected values
#thestdev = excel.WorksheetFunction.StDev(newrow)
#print thestdev
# Now we will add the confidence interval, as calculated by Excel
#conf = excel.WorksheetFunction.Confidence(.45,thestdev,len(newrow))
#print 'Average: ' + str(np.average(newrow)) + ' Confidence Interval: ' + str(conf)
#print ' '
#print ' '
#print str(np.average(newrow) - conf) + ' ' + str(np.average(newrow)) + ' ' + str(np.average(newrow) + conf)
con.close()
Arcpy Examples
Working with Arcpy in
ArcGIS
• Fire up ArcMap
• Click Python window
• Start typing code
–Look up help files – they are
excellent!
Getting Pywin32 to work
Copy
C:\Python27\ArcGIS10.3\Lib\site-packages\Desktop10.3
to
C:\Python27\ArcGIS10.3\Lib\site-packages\Desktop10.3
Buffer parcels and critical
area
import arcpy
arcpy.env.overwriteOutput = True
arcpy.env.workspace = "c:/tugis/python/Denton.gdb"
catemp = arcpy.MakeFeatureLayer_management("CA")
# raw_input returns the typed text unchanged (for example "100 Feet");
# Python 2's input() would try to evaluate it as code
testvar = raw_input("How far do you want the buffer? ")
cabuffer = arcpy.Buffer_analysis(catemp,"cabuffer",testvar)
caclip = arcpy.Clip_analysis("parcels",cabuffer,"parclip")
arcpy.Delete_management(cabuffer)
#arcpy.Delete_management(caclip)
MB
Total value of land in the
critical area by land use
import arcpy
import numpy
arcpy.env.workspace = "C:/tugis/python/Denton.gdb"
parceltemp = arcpy.MakeFeatureLayer_management("parcels")
catemp = arcpy.MakeFeatureLayer_management("CA")
LUs = [row[0] for row in arcpy.da.SearchCursor(parceltemp,"LU")]
LUs = list(set(LUs))
for lu in LUs:
    parsel = arcpy.SelectLayerByLocation_management(parceltemp,"INTERSECT",catemp)
    parsubsel = [row[0] for row in arcpy.da.FeatureClassToNumPyArray(parsel,"NFMTTLVL", "LU = '" + lu + "'")]
    totalvalue = numpy.sum(parsubsel)
    avgvalue = numpy.average(parsubsel)
    stdvalue = numpy.std(parsubsel)
    print "total value of " + lu + "= " + str(totalvalue) #+ " and the CV is " + str(stdvalue / avgvalue)
Move the parsel out of the loop
Total value…but easier
import arcpy
import numpy
arcpy.env.workspace = "C:/tugis/python/Denton.gdb"
parceltemp = arcpy.MakeFeatureLayer_management("parcels")
catemp = arcpy.MakeFeatureLayer_management("CA")
parsel = arcpy.SelectLayerByLocation_management(parceltemp,"INTERSECT",catemp)
parsummarize = arcpy.Statistics_analysis(parsel,None,[["NFMTTLVL","SUM"]],"LU")
for row in arcpy.SearchCursor(parsummarize):
    print row.LU + " $" + str(row.SUM_NFMTTLVL)
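What the summary step computes, a sum of one field grouped by another, can be sketched in plain Python without ArcGIS. The records below are hypothetical stand-ins for the selected parcels; this shows the grouping idea, not how arcpy implements it. It runs under both Python 2.7 and Python 3.

```python
from collections import defaultdict

# Hypothetical (land use, total value) pairs standing in for selected parcels
selected = [('Residential', 120000), ('Commercial', 450000),
            ('Residential', 95000), ('Agricultural', 60000),
            ('Commercial', 300000)]

totals = defaultdict(int)
for land_use, value in selected:
    totals[land_use] += value        # accumulate within each land-use group

for land_use in sorted(totals):
    print(land_use + " $" + str(totals[land_use]))
```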
Arcpy Potpourri
import arcpy, numpy, pygal
arcpy.env.workspace = "c:/tugis/python/denton/denton.gdb"
arcpy.env.overwriteOutput = True
acctnum = '0603012174'
##acctnum = raw_input('Enter Account Number: ')
parcels = arcpy.MakeFeatureLayer_management("parcels", "parcels_lyr")
soils = arcpy.MakeFeatureLayer_management("soils", "soils_lyr")
selpar = arcpy.SelectLayerByAttribute_management(parcels,"NEW_SELECTION","ACCTID = '" + acctnum +"'")
clip_soil = arcpy.Clip_analysis(soils,selpar,"clipsoil")
soil_stats = arcpy.Statistics_analysis(clip_soil,"soil_stats",[["Shape_Area","SUM"]],["MUSYM"])
mycur = arcpy.SearchCursor(soil_stats)
pie_chart = pygal.Pie()
pie_chart.title = 'Soil Amount'
for row in mycur:
    pie_chart.add(row.MUSYM,row.SUM_Shape_Area)
pie_chart.render_to_file('c:/tugis/python/' + acctnum + '.svg')
Fuzzy Neuro-logic in Arcpy
# Import the necessary environment. This includes
# arcpy, maybe the os or sys, or numpy for mathematics
#
import arcpy
import numpy as np
import win32com.client
excel = win32com.client.Dispatch("Excel.Application")
# Set up the workspace you are working in. That
# gets us to the data. Here we get the directory
# and also the feature data set
arcpy.env.workspace = "c:/tugis/python/denton.gdb"
parcels = arcpy.MakeFeatureLayer_management("parcels")
ca = arcpy.MakeFeatureLayer_management("ca")
#
# 90% of the time we will be doing some kind of selection.
# Here we are going to use the SearchCursor to return some
# features based on an attribute query
#
rows = arcpy.SearchCursor(parcels,"\"LU\" = 'Residential'")
#
# Once we get a result from a query, we have to get it into
# an array in order to manipulate it. So, we'll create an
# empty array and then fill it with a field value
#
A = []
for row in rows:
    A.append(row.NFMLNDVL)
#
# In this case, we are going to pass the data to Excel in order
# to determine a confidence interval. This allows us access to a
# statistical function offered by Excel that ArcGIS does not have
#
thestdev = excel.WorksheetFunction.StDev(A)
conf = excel.WorksheetFunction.Confidence(.45,thestdev,len(A))
#
# Just printing out some data to prove that I computed things correctly
#
print 'Average: ' + str(np.average(A)) + ' Confidence Interval: ' + str(conf)
#
# Now we can refocus our query to perform another query by selecting records
# that are 'like' the data we already have from the confidence interval. We
# will then apply the query and will have found the other parcels that are
# statistically similar to the parcels we chose earlier
#
thetxt = "\"NFMLNDVL\" > " + str(np.average(A)-conf) + " AND " + "\"NFMLNDVL\" < " + str(np.average(A)+conf)
print thetxt
therows = arcpy.SearchCursor(parcels,thetxt)
#
# Now we can write out the FID for the selected features
#
for row in therows:
    print row.FID
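The confidence-interval step can also be sketched without Excel or ArcGIS. Excel's CONFIDENCE(alpha, standard_dev, size) returns the half-width z * standard_dev / sqrt(size), where z is the normal quantile at 1 - alpha/2. The helper name `confidence_margin` and the land values are assumptions for illustration, and `statistics.NormalDist` requires Python 3.8 or later, so unlike the slide code this sketch does not run under Python 2.7.

```python
import math
from statistics import NormalDist, mean, stdev   # NormalDist needs Python 3.8+

def confidence_margin(alpha, standard_dev, size):
    """Half-width of a normal confidence interval for the mean."""
    z = NormalDist().inv_cdf(1 - alpha / 2.0)
    return z * standard_dev / math.sqrt(size)

# Made-up land values standing in for the residential parcels
land_values = [52000, 61000, 58000, 75000, 49000, 66000, 59000, 120000]
avg = mean(land_values)
margin = confidence_margin(0.05, stdev(land_values), len(land_values))

# Keep the values that fall inside the interval around the mean
low, high = avg - margin, avg + margin
similar = [v for v in land_values if low < v < high]
print(round(low), round(high))
print(similar)
```

Here alpha is the significance level: 0.05 gives the common 95% interval, while the .45 used on the slide corresponds to a much narrower 55% interval.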
Real life scenario
Crisfield Floods
JM
MB
JY
Scenario: The Town of Crisfield just got flooded this week after the
rainstorms. You have been brought in to support the response to
the disaster. The Mayor is getting hammered by the press, families,
and the State Emergency Management Office:
1. Total number of properties (parcels) that are under 2 feet of water
(flood layer, gridcode > 6)
2. Total value of the land (parcels layer, column "NFMTTLVL") that
is under 2 feet of water (flood layer, column gridcode > 6)
3. Total value of land, grouped by the landuse ("LU") that is under 2
feet of water (flood layer, column gridcode > 6)
Scenario: The Town of Crisfield just got flooded this week after the
rainstorms. You have been brought in to support the response to
the disaster. The Mayor is getting hammered by the press,
families, and the State Emergency Management Office:
4. What government buildings (gov_bldg) are under water?
5. Greg Sterling just called. He is out of town and wants to know if
his house is under water. If it is, what is the maximum amount of
water depth (gridcode)? His account number is '2007124449'.
6. The Highway Superintendent needs to know the names of
the streets ("FULL_NAME") and their cross streets
("FROMCROSS", "TOCROSS") that are under 2 feet of water
(gridcode = 5) so he can close off the streets.
..and now a word from our
students…
Where to go next
• How do I do that in Arcpy
• artlembo.com
• gisadvisor.com training courses
• Esri virtual campus
• Google, Google, Google
– And don’t be surprised if you get
directed to stackexchange
• Get a book, any book
Shameless plug
• www.gisadvisor.com/
– Spatial SQL: A Language for Geographers
– Python for Geospatial
• How do I do that book series
– How do I do that…