Exam 1 Solution – CIS4930 NLP – February 1, 2010
1.[5 pts] Define and describe what self means within a Python class.
This is the extra first parameter in the parameter list of any instance method of a class; it is not
passed explicitly in the method call but is supplied automatically by Python. It refers to the current
instance of the class. It is not required that this parameter be named self; that is just a widespread
(highly recommended) convention.
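The behavior described above can be illustrated with a minimal sketch. The class Counter and the parameter name this are hypothetical, chosen only for illustration; they are not part of the exam.

```python
class Counter:
    """Hypothetical example class, not part of the exam."""

    def __init__(self, start):
        # self is the instance currently being initialized
        self.count = start

    def increment(this):
        # The first parameter may have any name; "this" works,
        # but "self" is the strongly recommended convention
        this.count += 1
        return this.count


c = Counter(5)
# Python supplies the instance automatically:
# c.increment() is equivalent to Counter.increment(c)
first = c.increment()
second = Counter.increment(c)
print(first, second)
```

Both calls operate on the same instance, which is why the count keeps increasing from one call to the next.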
2. [10 pts] Create the Python class Tag. This class will include a string property name, the name of the
tag.
# It is not necessary to supply an __init__ method, but for clarity, one has been provided here
class Tag:
    def __init__(self, name):
        self.name = name
3. [15 pts] Create the Python class HTMLTag, a subclass of Tag. The class will contain a dictionary called
attributes. The keys in the dictionary will be the attribute names. The corresponding value for each key
will be the value of the attribute. In addition to the class structure, provide the methods search and
update. Search will return the value of a key passed to the method. Update will receive a key and a
value. It will update (or insert) the value at the key given.
class HTMLTag(Tag):
    def __init__(self, name):
        Tag.__init__(self, name)
        self.attributes = {}
    # It would be good to handle the case where the key is missing, but that was not required here.
    def search(self, key):
        return self.attributes[key]
    def update(self, key, value):
        self.attributes[key] = value
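As a sketch of how the missing-key case might be handled, the version below uses Python 3 syntax. The default parameter on search is an assumption added for illustration; it was not part of the exam question.

```python
class Tag:
    def __init__(self, name):
        self.name = name


class HTMLTag(Tag):
    def __init__(self, name):
        # super() calls the parent initializer without naming Tag explicitly
        super().__init__(name)
        self.attributes = {}

    def search(self, key, default=None):
        # Assumption: return a default instead of raising KeyError
        # when the attribute has never been set
        return self.attributes.get(key, default)

    def update(self, key, value):
        # Dictionary assignment inserts a new key or overwrites an existing one
        self.attributes[key] = value


link = HTMLTag("a")
link.update("href", "http://www.cise.ufl.edu/")
link.update("href", "http://www.ufl.edu/")  # overwrites the earlier value
print(link.name, link.search("href"), link.search("title"))
```

Returning a default keeps search from raising an exception, at the cost of making a missing attribute indistinguishable from one explicitly set to None.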
4. [10 pts] Describe the class of strings matched by these Python regular expressions.
a. r'[A-Z]{3}\d{4}'
Any string of three capital letters followed by four digits, e.g. a course number such as CIS4930.
b. r'\$\d+(\.\d{2})?'
Any string that contains a dollar sign, one or more digits, and then, optionally, a decimal point
followed by exactly two digits, e.g. a dollar amount such as $342.12.
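The two descriptions can be checked with a short sketch. The variable names and test strings other than CIS4930 and $342.12 are assumptions chosen for illustration.

```python
import re

course = re.compile(r'[A-Z]{3}\d{4}')
dollars = re.compile(r'\$\d+(\.\d{2})?')

# fullmatch succeeds only if the entire string fits the pattern
course_ok = course.fullmatch('CIS4930') is not None
course_lower = course.fullmatch('cis4930') is not None

# The cents are optional, but if present there must be exactly two digits
with_cents = dollars.fullmatch('$342.12') is not None
no_cents = dollars.fullmatch('$342') is not None
one_digit = dollars.fullmatch('$342.1') is not None

# search finds the pattern anywhere inside a longer string
found = dollars.search('The book costs $342.12 new.').group()

print(course_ok, course_lower, with_cents, no_cents, one_digit, found)
```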
5. [10pts] Create a Python regular expression to match this class of strings: a sentence with words
(containing only letters) and spaces, ending in a question mark, exclamation point, or period (?, !, or .).
r'[A-Za-z ]+[\?!\.]' (There are many possible correct answers)
6. [10pts] Create the Python code to compile the regular expression searching for words, spaces, and
punctuation (?, !, .) in the prior question and then determine all of these occurrences within the string
text.
import re
p = re.compile(r'[A-Za-z ]+[\?!\.]')
p.findall(text)
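A small run of the compiled pattern may make the result of findall concrete. The sample value of text is an assumption; the exam does not specify one.

```python
import re

# Hypothetical sample input; the exam leaves the contents of text unspecified
text = "Hello world. How are you? Fine!"

p = re.compile(r'[A-Za-z ]+[\?!\.]')
matches = p.findall(text)

# The character class includes the space, so later matches begin
# with the space that follows the previous sentence
stripped = [m.strip() for m in matches]

print(matches)
print(stripped)
```

Because the pattern has no capturing groups, findall returns the full text of each match as a list of strings.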
7. [20pts] Create the Python function acronym. The function will receive a string and return a string.
The returned string will be the acronym of the string passed into the function. The acronym will be
composed of only the first letter of each word. Each letter of the acronym will be capitalized and there
will be no spaces between the letters of the acronym. For example, “Computer Information Science
Engineering” will become “CISE” and “what you see is what you get” will become “WYSIWYG”.
# There are many possible correct solutions to this problem.
def acronym(text):
    words = text.split()
    output = ""
    for word in words:
        output += word[0]
    output = output.upper()
    return output
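As one of the many possible alternatives, the loop can be condensed with str.join and a generator expression. This is a sketch of an equivalent approach, not the graded solution.

```python
def acronym(text):
    # split() with no arguments discards leading, trailing, and repeated whitespace,
    # so every resulting word has at least one character
    return "".join(word[0] for word in text.split()).upper()


print(acronym("Computer Information Science Engineering"))
print(acronym("what you see is what you get"))
```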
8. [10pts] Create the Python code to print to the screen the acronym of each line from the file:
http://www.cise.ufl.edu/~pjd/data.txt.
# Python 2 syntax; acronym is the function from question 7
import urllib
handle = urllib.urlopen('http://www.cise.ufl.edu/~pjd/data.txt')
for line in handle:
    print acronym(line)
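In Python 3, urlopen lives in urllib.request and yields bytes, and print is a function. A hedged sketch of an equivalent follows; the helper names print_acronyms and print_acronyms_from_url are hypothetical, and UTF-8 is an assumed encoding for the file.

```python
import urllib.request


def acronym(text):
    return "".join(word[0] for word in text.split()).upper()


def print_acronyms(lines):
    # Hypothetical helper: accepts any iterable of str or bytes lines
    results = []
    for line in lines:
        if isinstance(line, bytes):
            line = line.decode("utf-8")  # assumption: the file is UTF-8 text
        result = acronym(line)
        print(result)
        results.append(result)
    return results


def print_acronyms_from_url(url):
    # urlopen returns a file-like object that yields one bytes line at a time
    with urllib.request.urlopen(url) as handle:
        return print_acronyms(handle)


# Network call left commented out so the sketch runs offline:
# print_acronyms_from_url('http://www.cise.ufl.edu/~pjd/data.txt')
sample = print_acronyms([b"what you see is what you get\n",
                         "Computer Information Science Engineering"])
```

Separating the line processing from the download makes the logic easy to try on in-memory data before pointing it at the real file.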
9. [20pts] Create the Python function decreasing. The function will receive a list and return a list. The
function will sort the words of the list in decreasing order of character length. Ties are resolved by
listing elements of equal length in succession, in their original order from the list. For example, if the list
['This', 'is', 'a', 'test'] is received, the function will return ['This', 'test', 'is', 'a']. Note how 'This' precedes
'test' in the result, as it did in the original.
# There are many possible correct solutions to this problem; any stable sort algorithm can be used
# Comparison function
def cmpLength(string1, string2):
    if len(string1) > len(string2):
        return -1
    elif len(string2) > len(string1):
        return 1
    else:
        return 0
# Sorting function (Python 2: list.sort accepts a comparison function and is stable)
def decreasing(inputList):
    inputList.sort(cmpLength)
    return inputList
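Python 3 removed the comparison-function argument of list.sort, so the solution above needs adapting there. The sketch below shows two possible adaptations; the names cmp_length and decreasing_cmp are hypothetical. Unlike the exam's version, both return a new list rather than sorting the argument in place.

```python
from functools import cmp_to_key


def decreasing(words):
    # sorted() is stable, and reverse=True still keeps equal-length
    # words in their original relative order
    return sorted(words, key=len, reverse=True)


def cmp_length(string1, string2):
    # Same ordering as the exam's cmpLength: negative when string1 is longer,
    # so longer strings sort first
    return len(string2) - len(string1)


def decreasing_cmp(words):
    # cmp_to_key adapts an old-style comparison function for Python 3
    return sorted(words, key=cmp_to_key(cmp_length))


original = ['This', 'is', 'a', 'test']
print(decreasing(original))
print(decreasing_cmp(original))
```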