Advanced introduction to Python (draft)
Fredrik Wahlberg


Hello World

Python 2.7.5+ (default, Sep 19 2013, 13:48:49)
[GCC 4.8.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> print "Hello World"
Hello World

Scripted:

$ python helloworld.py
Hello World

helloworld.py:

# Comment
print "Hello World"


Math

>>> 1+1
2
>>> 2345234523452345+234587347826948376
236932582350400721
>>> 123*123
15129
>>> 10**3
1000
>>> 2.8**3
21.951999999999995
>>> 10-4
6
>>> 6+4
10
>>> 9/3
3
>>> 9/2        # integer division in Python 2
4
>>> -9/2
-5
>>> 13%2
1
>>> 13%5
3
>>> -13%5
2
>>> 2*2
4
>>> 32*32
1024
>>> 256*256
65536
>>> 65536*65536
4294967296
>>> 2**32
4294967296
>>> 2**33
8589934592
>>> 2**34
17179869184
>>> 2**100
1267650600228229401496703205376L
>>> 2.2345*.234
0.522873
>>> (2 + 4j)*3j
(-12+6j)


Data types

(Note that list and dict are also the names of the built-in types; reusing them as variable names shadows the built-ins.)

>>> list = []
>>> list
[]
>>> list = ['a', 'b', 1, 2]
>>> list
['a', 'b', 1, 2]
>>> list[2]
1
>>> list[:2]
['a', 'b']
>>> list[2:]
[1, 2]
>>> list[1:3]
['b', 1]
>>> list.append('c')
>>> list.append('d')
>>> list
['a', 'b', 1, 2, 'c', 'd']

>>> dict = {}
>>> dict['a'] = 'b'
>>> dict['c'] = 10
>>> dict
{'a': 'b', 'c': 10}
>>> dict['some_list'] = [1, 2, 3]
>>> dict['another_dictionary'] = {'mykey': 'mydata', 1: 2}
>>> dict
{'a': 'b', 'c': 10, 'some_list': [1, 2, 3], 'another_dictionary': {1: 2, 'mykey': 'mydata'}}


Comparison

Python 2.7.5+ (default, Sep 19 2013, 13:48:49)
[GCC 4.8.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 1==1
True
>>> 1==2
False
>>> 1!=2
True
>>> not 2+2==4
False
>>> 2+2==4
True
>>> 'abc' < 'ABC'
False
>>> 'abc' > 'ABC'
True
>>> 'abc' == 'ABC'
False
>>> 'abc' == 'abc'
True
>>>


Control flow & functions

if 2+2 == 4:
    print "2+2=4"
elif 2+2 == 3:
    print "2+2=3"
else:
    print "2+2=?"

i = 0
while i < 100:
    i = i + 1

for i in range(100):
    print i

>>> def somefunction(a, b):
...     print a+b
...
>>> somefunction(1, 2)
3
>>>


Modules and docstrings

Python 2.7.5+ (default, Sep 19 2013, 13:48:49)
[GCC 4.8.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> print nltk.__doc__

The Natural Language Toolkit (NLTK) is an open source Python library
for Natural Language Processing.  A free online book is available.
(If you use the library for academic research, please cite the book.)

Steven Bird, Ewan Klein, and Edward Loper (2009).
Natural Language Processing with Python.  O'Reilly Media Inc.
http://nltk.org/book

@version: 2.0.4

>>> print nltk.model.NgramModel.__doc__

    A processing interface for assigning a probability to the next word.

>>> dir(nltk.model)
['NgramModel', '__builtins__', '__doc__', '__file__', '__name__', '__package__', '__path__', 'api', 'ngram']
>>> print nltk.model.NgramModel.__init__.__doc__

    Create an ngram language model to capture patterns in n consecutive
    words of training text.  An estimator smooths the probabilities derived
    from the text and may allow generation of ngrams not seen during
    training.

        >>> from nltk.corpus import brown
        >>> from nltk.probability import LidstoneProbDist
        >>> est = lambda fdist, bins: LidstoneProbDist(fdist, 0.2)
        >>> lm = NgramModel(3, brown.words(categories='news'), estimator=est)
        >>> lm
        <NgramModel with 91603 3-grams>
        >>> lm._backoff
        <NgramModel with 62888 2-grams>
        >>> lm.entropy(['The', 'Fulton', 'County', 'Grand', 'Jury', 'said',
        ... 'Friday', 'an', 'investigation', 'of', "Atlanta's", 'recent',
        ... 'primary', 'election', 'produced', '``', 'no', 'evidence',
        ... "''", 'that', 'any', 'irregularities', 'took', 'place', '.'])
        ... # doctest: +ELLIPSIS
        0.5776...

    :param n: the order of the language model (ngram size)
    :type n: int
    :param train: the training text
    :type train: list(str) or list(list(str))
    :param pad_left: whether to pad the left of each sentence with an (n-1)-gram of empty strings
    :type pad_left: bool
    :param pad_right: whether to pad the right of each sentence with an (n-1)-gram of empty strings
    :type pad_right: bool
    :param estimator: a function for generating a probability distribution
    :type estimator: a function that takes a ConditionalFreqDist and returns a ConditionalProbDist
    :param estimator_args: Extra arguments for estimator.
        These arguments are usually used to specify extra
        properties for the probability distributions of individual
        conditions, such as the number of bins they contain.
        Note: For backward-compatibility, if no arguments are specified, the
        number of bins in the underlying ConditionalFreqDist are passed to
        the estimator as an argument.
    :type estimator_args: (any)
    :param estimator_kwargs: Extra keyword arguments for the estimator
    :type estimator_kwargs: (any)

>>>
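The docstring above spells out the full constructor signature. As a concrete illustration, here is a minimal sketch that builds a padded, smoothed bigram model using those parameters. It is a sketch only: it assumes an NLTK 2.x installation whose nltk.model.NgramModel accepts the pad_left/pad_right parameters listed above (the class was removed in NLTK 3), that the Brown corpus has already been downloaded (see the NLTK slide below), and that the model exposes the prob(word, context) method of the NLTK 2.x model API. The chosen words 'the' and 'jury' are only an example query.

# Sketch only -- assumes NLTK 2.x (nltk.model.NgramModel) and a downloaded Brown corpus.
import nltk
from nltk.corpus import brown
from nltk.probability import LidstoneProbDist

# Lidstone-smoothed estimator (gamma = 0.2), as in the docstring example above.
est = lambda fdist, bins: LidstoneProbDist(fdist, 0.2)

# Bigram model trained on tokenized sentences, padded on both sides so that
# sentence-initial and sentence-final words get (n-1)-gram contexts.
lm = nltk.model.NgramModel(2, brown.sents(categories='news'),
                           pad_left=True, pad_right=True,
                           estimator=est)

# Probability of 'jury' given the one-word context 'the'; the exact value
# depends on the corpus slice and the smoothing used.
print lm.prob('jury', ['the'])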
Classes

Python 2.7.5+ (default, Sep 19 2013, 13:48:49)
[GCC 4.8.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from Complex import Complex
>>> x = Complex(3.0, -4.5)
>>> x.r, x.i
(3.0, -4.5)
>>> x
(3.0, -4.5j)
>>> y = Complex(2.5, 1)
>>> y
(2.5, 1j)
>>> x.add(y)
>>> x
(5.5, -3.5j)
>>>

File Complex.py:

class Complex:
    def __init__(self, realpart, imagpart):
        self.r = realpart
        self.i = imagpart

    def add(self, other):
        self.r = self.r + other.r
        self.i = self.i + other.i

    def __repr__(self):
        return "(" + str(self.r) + ", " + str(self.i) + "j)"


NLTK

Python 2.7.5+ (default, Sep 19 2013, 13:48:49)
[GCC 4.8.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> nltk.download('gutenberg')
[nltk_data] Downloading package 'gutenberg' to
[nltk_data]     /home/fredrik/nltk_data...
[nltk_data]   Unzipping corpora/gutenberg.zip.
True
>>> nltk.download('brown')
[nltk_data] Downloading package 'brown' to /home/fredrik/nltk_data...
[nltk_data]   Unzipping corpora/brown.zip.
True
>>> nltk.download()
showing info http://nltk.googlecode.com/svn/trunk/nltk_data/index.xml
True
>>>


NLTK: N-gram

Python 2.7.5+ (default, Sep 19 2013, 13:48:49)
[GCC 4.8.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from nltk.corpus import brown
>>> data = brown.sents(categories='news')
>>> data = data[:len(data)/2]
>>> from nltk.model import NgramModel
>>> from nltk import MLEProbDist
>>> trigram = NgramModel(3, data, estimator=MLEProbDist)
>>>


Getting the data

def Pml(ngram, word, context):
    # Maximum-likelihood probability of a word given its context,
    # read directly from the model's frequency data.
    context = tuple(context)
    if (context + (word,) in ngram._ngrams) or (ngram._n == 1):
        fd = ngram[context].freqdist()   # get the data container: the FreqDist for this context
        i = float(fd[word])              # number of occurrences of the word in this context
        N = float(fd.N())                # total number of words seen in this context (fd.N() is the sample count)
        return i/N
    else:
        return 0

(A short usage sketch follows the resources below.)


Links and resources

The internet is full of good resources:

http://docs.python.org/2/tutorial/
http://pyvideo.org/

Book: Python Essential Reference, David M. Beazley
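Tying the last NLTK slides together, here is the usage sketch referenced from the "Getting the data" slide: it compares Pml() against the model's own estimate. It is a sketch only: it assumes the trigram model (NLTK: N-gram slide) and Pml() (Getting the data slide) are already defined in the session, an NLTK 2.x NgramModel with a prob(word, context) method, and that the queried trigram occurs in the training half of the news sentences; the words 'The', 'Fulton', 'County' are just an example query.

# Sketch only -- assumes `trigram` and Pml() from the slides above, NLTK 2.x API.
context = ['The', 'Fulton']              # the two preceding words for a trigram model
word = 'County'

p_manual = Pml(trigram, word, context)   # relative frequency read from the FreqDist
p_model = trigram.prob(word, context)    # the model's own estimate (MLE estimator here)

# For a trigram seen in training, the two numbers should agree, since both
# reduce to count(context + word) / count(context).
print p_manual, p_model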