Darren's Data Analytics Blog

Programming For Big Data
Darren Redmond
• Programming Languages
  • Python
  • R
  • Java
  • C, C++
  • Ruby
Why, Why, Why
• History of Python
  • Guido van Rossum started writing Python over the Christmas holidays in 1989 as a hobby project
• Why Python
  • Easy to learn
  • Powerful
  • Data structures
  • Modular
  • Embedding
  • Map Reduce / Lambda / Yield
  • Interactive Shell
  • http://www.python-course.eu/index.php
• The End Game
  • http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/
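The "Map Reduce / Lambda / Yield" bullet can be made concrete with a small in-memory word count. This is only a sketch under stated assumptions: it is written for Python 3, it does not use Hadoop, and it is not the code from the linked tutorial. The names words, add_count and sample_lines are invented for illustration.

```python
from functools import reduce

def words(lines):
    """Generator: yield one lowercase word at a time."""
    for line in lines:
        for word in line.split():
            yield word.lower()

def add_count(counts, pair):
    """Reduce step: fold one (word, 1) pair into the running dictionary."""
    word, one = pair
    counts[word] = counts.get(word, 0) + one
    return counts

# Made-up input, standing in for lines read from a file.
sample_lines = ["big data is big", "python is easy"]

# Map step: a lambda turns every word into a (word, 1) pair.
pairs = map(lambda word: (word, 1), words(sample_lines))

# Reduce step: combine all pairs into a single dictionary of counts.
word_counts = reduce(add_count, pairs, {})
print(word_counts)
```

Because words is a generator and map is lazy in Python 3, no intermediate list of words is built; each word is produced only when reduce asks for it.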
Interactive Interpreter
• python
• >>> print "Hello World"
• easier?
  • "Hello World"
  • 12 / 7 # 1 (integer division in Python 2)
  • 12.0 / 7 # 1.714...
  • 3 + 2 * 4 # 11
  • _ # the most recent value
  • _ * 3 # 33
  • Ctrl-D # exit the interpreter
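The session above uses Python 2, where dividing two integers with / truncates the result. As a rough sketch, assuming Python 3 instead, the same arithmetic can be written as a script. The variable names are illustrative only, and a named variable stands in for _, which is only set in the interactive shell.

```python
# Python 3 sketch of the interpreter arithmetic (the slides use Python 2).
true_quotient = 12 / 7      # "/" is always true division in Python 3
floor_quotient = 12 // 7    # "//" floors, like Python 2's 12 / 7 on integers
precedence = 3 + 2 * 4      # multiplication binds tighter than addition

print("Hello World")        # print is a function in Python 3
print(true_quotient)
print(floor_quotient)       # 1
print(precedence)           # 11
print(precedence * 3)       # 33, the value "_ * 3" shows in the shell
```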
Execute Script
• There are several ways to execute or compile a script; the four commands below use a script called script-name.py:
• From the command prompt, uncompiled
  • python script-name.py
• From the Python interpreter (for now, start python from the script's directory)
  • import py_compile
  • py_compile.compile('script-name.py')
• Compile from the command line
  • python -m py_compile script-name.py
  • python -m compileall .
• Both the .py and the compiled .pyc files are then available
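The py_compile step can also be tried end to end from a script. The sketch below assumes Python 3 and writes a throwaway script into a temporary directory. The file name script_name.py and its contents are made up for the demonstration.

```python
import os
import py_compile
import tempfile

# Hypothetical script, created only so there is something to compile.
work_dir = tempfile.mkdtemp()
script_path = os.path.join(work_dir, "script_name.py")
with open(script_path, "w") as handle:
    handle.write('print("Hello World")\n')

# compile() writes the bytecode file and returns its path.
# doraise=True raises py_compile.PyCompileError on a syntax error
# instead of only printing a message.
compiled_path = py_compile.compile(script_path, doraise=True)
print(compiled_path)
```

In Python 3 the .pyc file normally lands in a __pycache__ directory next to the script, rather than beside the .py file as in Python 2.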
Indentation
• Blocks (scope) are defined by indentation, not braces
• Variables are created automatically on first assignment, and their type is inferred from the value
  • i = 42
  • i = i + 1 # 43
  • print id(i) # identity of the object that i refers to
• Types
  • Numbers -> integers, long integers, floating point numbers, complex numbers
  • Strings -> operations: concatenation (+), slicing [2:4]
• Reading input: input, raw_input
• Converting to a list, etc.
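The points on this slide can be combined into one short script. This is a minimal sketch assuming Python 3, where raw_input no longer exists and input() always returns a string. A literal string stands in for user input so the script runs unattended, and the names text, middle, letters and number are illustrative.

```python
i = 42                 # created on first assignment; no declaration needed
i = i + 1              # 43: the name i is rebound to a new int object
print(type(i), id(i))  # the type is inferred; id() identifies the object

text = "Hello" + " " + "World"   # concatenation with +
middle = text[2:4]               # slice: characters at index 2 and 3
letters = list(text[:5])         # converting a string to a list of characters

typed = "42"                     # stands in for the string input() would return
number = int(typed) + 1          # convert explicitly before doing arithmetic

if i > 42:
    # The indented lines form the block; no braces are needed.
    print("i is now", i)
print(middle, letters, number)
```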
Conditional
• if
• elif
• else
• C has a conditional operator: max = (a > b) ? a : b; is an abbreviation for the following C code
  • if (a > b)
  •     max = a;
  • else
  •     max = b;
• C programmers have to get used to a different notation in Python
  • max = a if (a > b) else b
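Both forms of conditional can be seen side by side in a short script. This is a sketch assuming Python 3. The function describe is a made-up helper, and the result is stored in largest rather than max so that the built-in max() is not shadowed.

```python
def describe(a, b):
    """Return a short comparison of a and b using if / elif / else."""
    if a > b:
        return "a is larger"
    elif a < b:
        return "b is larger"
    else:
        return "equal"

a, b = 7, 12
largest = a if a > b else b   # conditional expression, Python's form of ?:
print(describe(a, b), largest)
```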
Looping
#!/usr/bin/env python
n = 100
sum = 0
i = 1
while i <= n:
    sum = sum + i
    i = i + 1
print "Sum of 1 until %d: %d" % (n, sum)
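For comparison, here is the same loop as a sketch in Python 3, where print is a function. The accumulator is called total instead of sum so that the built-in sum() stays usable, and the closed-form check n(n + 1)/2 is added here as a sanity test; it is not part of the original slide.

```python
n = 100
total = 0
i = 1
while i <= n:          # the indented lines below form the loop body
    total = total + i
    i = i + 1

closed_form = n * (n + 1) // 2   # Gauss formula for 1 + 2 + ... + n
print("Sum of 1 until %d: %d" % (n, total))
print("Matches closed form:", total == closed_form)
```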
Bibliography
• Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython
  • Wes McKinney, O’Reilly, 2012
• Programming Python, 4th Edition: Powerful Object-Oriented Programming
  • Mark Lutz, O’Reilly, 2010
• Agile Data Science: Building Data Analytics Applications with Hadoop
  • Russell Jurney, O’Reilly, 2013
• Functional Python Programming
  • Steven Lott, Packt Publishing, 2015
Practice, Practice, Practice
• From Lecture 1 you should be able to write a Python script file that does calculations and prints the results to the screen
• Write a program to print ‘Hello World’ to the screen
• Write a program to sum the first 100 numbers
• Write a program to multiply the first 10 numbers
Summary
• Programming Languages for Big Data
• Why Python
• Hello World
• Executing a Script
• Indentation
• Conditional Programming
• Looping
• Bibliography
• Practice