PROGRAMMING FOR AUTOMATION OF COMMON DATA MANAGEMENT TASKS
HYDROINFORMATICS
DR. DAN AMES, BRIGHAM YOUNG UNIVERSITY

OUTLINE
Tuesday:
• Introducing Python
• Brief Python example using ArcPy
• Key Python coding concepts and conventions
Thursday:
• PYODBC

WHAT IS PYTHON?
Python is an “Interpreted Scripting Language”: each line of your script performs one command.
(As a side note: VBA is also an interpreted language. So is JavaScript. This means that the computer reads the code and runs it directly from the text file. C# is “compiled”, which means a binary “DLL” or “EXE” file is created that contains the instructions for the computer.)

WHY USE PYTHON?
• Automation of simple or complex processes
• Extensive data processing capabilities
• Access to the file system and operating system
• Access to advanced math, statistics, and database functions
• Integrated in ArcGIS and other applications
• Web access
• Easy to use (simple syntax)
• Lightweight, open source
• And so much more…

PYTHON MODULES
50 Top Python Modules (from www.CatsWhoCode.com)
Think Legos…

A REAL WORLD EXAMPLE: LAKE CONTOUR DATA PROCESSING
The problem: How do you convert 121 lake bottom contour shapefiles into DEM rasters in a new file format that ESRI doesn’t support (MapWindow BGD)?

Original Solution: Manually run each data set through a ModelBuilder model, and then through a MapWindow converter… About 10 minutes of hands-on time per data set.
That’s 20 hours…

Python Solution: Spend some time making a decent script, point it at a folder, start it running and go do something else…
Watch the video for a detailed explanation of the code: http://youtu.be/dsUYv3R858Q

KEY PYTHON CODING CONCEPTS AND CONVENTIONS

The Python Command Line Interpreter
Interactive interface to Python:

    % python
    Python 2.6 (r26:66714, Feb 3 2009, 20:49:49)
    [GCC 4.3.2 [gcc-4_3-branch revision 141291]] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>>

The Python interpreter evaluates inputs:

    >>> 3*(7+2)
    27

TRY IT…
Try running Python from the Command Line
• In Windows, click the Start menu
• In the “search programs and files” box, type “Command Prompt” or “cmd”
• In the command prompt window, type “python”
• Try typing some mathematical expressions:

    >>> 1 + 1
    >>> 8 - 100
    >>> 9 + 0

• Try setting a variable and using it:

    >>> a = 3
    >>> b = 4
    >>> c = (a ** 2 + b ** 2) ** 0.5
    >>> c
    5.0

The IDLE GUI Environment (Windows)
• Shell for interactive evaluation of Python code.
• Text editor with color-coding and smart indenting for creating Python files.
• Menu commands for changing system settings and running files.

TRY IT…
Try running the Python interpreter through IDLE
• In Windows, click the Start menu
• In the “search programs and files” box, type “idle”
• This should open the IDLE editor.
• You can also open it from the Start Menu directly.

Main Python Programming Concepts
• Comments are identified by a # sign.
• Indentation matters to the meaning of the code: block structure is indicated by indentation.
• The first assignment to a variable creates it.
• Variable types don’t need to be declared. Python figures out the variable types on its own.
• Assignment uses = and comparison uses ==
• For numbers, + - * / % work as expected.
• Special use of + for string concatenation.
• Special use of % for string formatting (as with printf in C).
• Logical operators are words (and, or, not).
• Simple printing can be done with print.

PRIMARY PYTHON DATA TYPES
• Numbers
• Strings
• Lists
• Functions

LISTS AND ARRAYS
Python Arrays: 1-D

    >>> import numpy
    >>> x = numpy.array([0,0,5,0,0,0])
    >>> x[2]
    5
    >>> x[3] = 99
    >>> x
    array([ 0,  0,  5, 99,  0,  0])

Remember that Python uses “zero-based arrays”: the subscript 2 refers to the 3rd element in the array.

Python Arrays: 2-D

    >>> x = numpy.array([[90,91,92],[55,56,57]])
    >>> x[1][1]
    56
    >>> x[1][2] = 109
    >>> x
    array([[ 90,  91,  92],
           [ 55,  56, 109]])

Remember that Python uses “zero-based arrays”. Also, rather than declaring the array size, you just fill the array with values when creating it. The number of values determines the size.

Python Lists
• A list is a container that holds a number of other objects in a given order.
• To create a list, put a number of expressions in square brackets:

    >>> L1 = []              # This is an empty list
    >>> L2 = [90,91,92]      # This list has 3 integers
    >>> L3 = ["Captain America","The Hulk","Iron Man","Black Widow"]
    >>> print("My favorite superhero is " + L3[3])
    My favorite superhero is Black Widow

• To get a range of elements from a list, use: L2[start:stop]

LOOPS
Example 2:

    # range() is used to quickly create a sequence list
    for i in range(100):
        print i, i*100

Example 3:

    a = ['Mary', 'had', 'a', 'little', 'lamb']
    for i in range(len(a)):
        print i, a[i]

Example 4:

    import os
    # os.listdir returns a list of items in a folder
    folderitems = os.listdir("c:/temp")
    for f in folderitems:
        print(f)

IF-THEN STATEMENTS
Remember Visual Basic? Python is similar… But different.
Visual Basic:

    If x > 0 Then
        'do something
    ElseIf x = 0 Then
        'do something else
    Else
        'and something else
    End If

Python:

    if x > 0:
        pass    # do something
    elif x == 0:
        pass    # do something else
    else:
        pass    # and something else

How is the syntax similar? How is it different?

Example 4 (Continued): Try to find all the shapefiles in a folder… Notice the forward slashes!

Try playing with Python
NUMBERS

    >>> 2+3
    5
    >>> 2/3
    0
    >>> 2.0/3
    0.66666666666666663
    >>> x=4.5
    >>> int(x)
    4

STRINGS

    >>> x='abc'
    >>> x[0]
    'a'
    >>> x[1:3]
    'bc'
    >>> x[:2]
    'ab'
    >>> x[1]='d'
    Traceback (most recent call last):
      File "<pyshell#14>", line 1, in <module>
        x[1]='d'
    TypeError: 'str' object does not support item assignment

LISTS

    >>> x=['a','b','c']
    >>> x[1]
    'b'
    >>> x[1:]
    ['b', 'c']
    >>> x[1]='d'
    >>> x
    ['a', 'd', 'c']

FUNCTIONS

    >>> def p(x):
            if len(x) < 2:
                return True
            else:
                return x[0] == x[-1] and p(x[1:-1])

    >>> p('abc')
    False
    >>> p('aba')
    True
    >>> p([1,2,3])
    False
    >>> p([1,'a','a',1])
    True
    >>> p((False,2,2,False))
    True
    >>> p(('a',1,1))
    False

USING A SCRIPT FILE
• Run the program: Start menu > Python 2.7 > IDLE
• Click “File”, then “New Window” to get a new blank script file (like the Notepad text editor)
• Type your lines of code and save the file as ‘test.py’
• Click “Run”, then “Run Module”. This will run the script and the results will appear in the shell window…

    x = 34 - 23            # A comment.
    y = "Hello"            # Another one.
    z = 3.45
    if z == 3.45 or y == "Hello":
        x = x + 1
        y = y + "World"    # String concat.
    print x
    print y

(Screenshot callouts: type several lines of code in the editor window; save, then use “Run Module”; see the results in the shell window.)

Step 1: Start the Python Shell “IDLE”. You can find it in your Start Menu under ArcGIS/Python 2.6/IDLE (Python GUI). This is another command line interpreter that comes by default with Python (and with ArcMap). Feel free to try a few commands just to make sure…

Step 2: Start a new Python text file.
You can write Python in Notepad, but if you use a formal editor, you get color coding! Click “File/New Window”. This will load a blank Python text file. Write your script and save it. Then click “Run/Run Module” to test it. Results appear in the “shell” window.

You try it! A Code Sample (in IDLE)

    x = 34 - 23            # A comment.
    y = "Hello"            # Another one.
    z = 3.45
    if z == 3.45 or y == "Hello":
        x = x + 1
        y = y + "World"    # String concat.
    print x
    print y

THREE USEFUL MODULES
What is a module?
• Like an “extension” or “plugin”
• Adds functionality to Python that is not in the core environment.
• In the interpreter, type help("modules") to get a list of all currently installed modules.
• To use a module in your code, first import it:

    import os
    import shutil
    import arcpy

Let’s look at the module “os”
• Think of it as a connector into your current operating system
• Use it to do useful file-level activities

    os.getcwd()          returns the current working directory
    os.curdir            the string that refers to the current directory (".")
    os.execl             executes an executable file, replacing the current process
    os.abort             fails in the hardest way possible
    os.listdir           returns a list of strings of names of entries in a folder
    os.path.exists()     tests whether a path exists
    os.path.basename()   returns the final component of a path name
    os.path.isdir()      returns true for a directory, false for a filename

Let’s look at the module “shutil”
• Think of it as a high-level file/folder manipulator

    shutil.copy, copyfile, copytree   copy things
    shutil.abspath       returns the absolute path to a file (the documented name is os.path.abspath)
    shutil.fnmatch       tests if a file path matches a pattern (the documented name is fnmatch.fnmatch)
    shutil.move          moves a file or folder to another destination
    shutil.rmtree        recursively removes the contents of a directory tree

Let’s look at the module “pyodbc”
• Think of it as a database connector

    pyodbc.connect       connect to a database
    connection.cursor    a “pointer” to a row in a database
    cursor.execute       run an SQL query
    cursor.fetchone      get a row from a database
    row.some_field_name  returns the value of a field in a row

LET’S LOOK AT SOME ODM/ODBC CODE…

    import pyodbc
    import csv

    # Function for getting a list of sites from a HydroServer
    def GetSites(VariableID):
        cnxn = pyodbc.connect('DRIVER={SQL Server};SERVER=hydroserver.uwrl.usu.edu;DATABASE=LittleBearRiverODM;UID=Hydroinformatics;PWD=F4ll2013!!')
        crsr = cnxn.cursor()
        q = """ select Sites.SiteID, Sites.SiteName,
                min(DataValues.LocalDateTime) as BeginDate,
                max(DataValues.LocalDateTime) as EndDate,
                count(DataValues.DataValue) as Observations
                from Sites inner join DataValues
                on Sites.SiteID = DataValues.SiteID
                where DataValues.VariableID =""" + str(VariableID) + """
                and DataValues.DataValue <> -9999
                group by Sites.SiteID, Sites.SiteName
                order by Sites.SiteID"""
        crsr.execute(q)
        # Write each result row as one quoted, comma-separated line
        f = open("GetSites.csv", "w")
        for row in crsr:
            f.write('"' + '","'.join([str(s) for s in row]) + '"')
            f.write('\n')
        f.close()
        cnxn.close()

    # Function for getting a data set from a specific site
    def GetValues(VariableID, SiteID):
        rows = []
        cnxn = pyodbc.connect('DRIVER={SQL Server};SERVER=hydroserver.uwrl.usu.edu;DATABASE=LittleBearRiverODM;UID=Hydroinformatics;PWD=F4ll2013!!')
        crsr = cnxn.cursor()
        q = """select Sites.SiteID, Sites.SiteName,
               DataValues.LocalDateTime, DataValues.DataValue
               from Sites inner join DataValues
               on Sites.SiteID = DataValues.SiteID
               where Sites.SiteID =""" + str(SiteID) + """
               and DataValues.VariableID =""" + str(VariableID) + """
               and DataValues.DataValue <> -9999
               group by Sites.SiteID, Sites.SiteName,
               DataValues.LocalDateTime, DataValues.DataValue
               order by DataValues.LocalDateTime"""
        crsr.execute(q)
        f = open("GetValues.csv", "w")
        for row in crsr:
            f.write('"' + '","'.join([str(s) for s in row]) + '"')
            f.write('\n')
            rows.append(row[3])    # keep only the DataValue column
        f.close()
        cnxn.close()
        return rows

    # Main Program
    # Find sites that have streamflow (variable 44)
    GetSites(44)
    # Download data for variable 44 at site 1 (Mendon Rd)
    r = GetValues(44, 1)
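
A VARIATION TO TRY: QUERY PARAMETERS AND THE CSV MODULE
The functions above build their SQL by joining str(VariableID) into the query text and write the CSV lines by hand. The sketch below shows one possible alternative: a "?" placeholder filled in by cursor.execute, and csv.writer for the quoting. It rests on several assumptions: it is written for Python 3, and it uses the standard-library sqlite3 module with a tiny in-memory table as a stand-in for the ODM SQL Server database so that it runs without a server. The helper names make_demo_db and get_sites, the site names, and the data values are all hypothetical. pyodbc uses the same "?" placeholder style, so the execute call should carry over, but this has not been run against the HydroServer.

```python
import csv
import io
import sqlite3


def make_demo_db():
    """Build a tiny in-memory database (hypothetical data, not the real ODM)."""
    cnxn = sqlite3.connect(":memory:")
    cnxn.execute("create table Sites (SiteID integer, SiteName text)")
    cnxn.execute(
        "create table DataValues (SiteID integer, VariableID integer, DataValue real)"
    )
    cnxn.executemany("insert into Sites values (?, ?)",
                     [(1, "Site A"), (2, "Site B")])
    cnxn.executemany("insert into DataValues values (?, ?, ?)",
                     [(1, 44, 3.2), (1, 44, -9999), (1, 44, 4.1), (2, 7, 1.0)])
    return cnxn


def get_sites(cnxn, variable_id, out):
    """Write one CSV row per site that has valid data for variable_id."""
    q = """select Sites.SiteID, Sites.SiteName,
                  count(DataValues.DataValue) as Observations
           from Sites inner join DataValues
                on Sites.SiteID = DataValues.SiteID
           where DataValues.VariableID = ?
             and DataValues.DataValue <> -9999
           group by Sites.SiteID, Sites.SiteName
           order by Sites.SiteID"""
    crsr = cnxn.cursor()
    # The driver substitutes variable_id for the "?" placeholder
    crsr.execute(q, (variable_id,))
    rows = crsr.fetchall()
    # QUOTE_ALL wraps every field in double quotes, like the hand-built lines
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerows(rows)
    return rows


cnxn = make_demo_db()
buffer = io.StringIO()          # stands in for open("GetSites.csv", "w")
sites = get_sites(cnxn, 44, buffer)
cnxn.close()
csv_text = buffer.getvalue()
print(csv_text)
```

Passing the value as a parameter keeps it out of the SQL text, and csv.writer also takes care of escaping any double quotes that appear inside a site name.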