COMPUTING SCIENCE & MATHEMATICS
Computing Science Examination
Autumn Semester 2014

ITNPD2: Introduction to Big Data

Date
1.5 hour exam

Attempt TWO questions out of THREE. All questions carry equal marks. The distribution of marks among the parts of each question is indicated.

IMPORTANT NOTE
Read the instructions on the front of each answer book carefully. It is essential that you write your student number on the front of each answer book. Also, when you have completed the examination, the number of answer books that you have used must be prominently written on the front of one book.

ITNPD2 XX December 2014

QUESTION 1

1 (a) (i) Python is described as a scripting language. It is interpreted (rather than compiled), and is dynamically typed (rather than statically typed). Give a brief explanation of these terms. [2]

      (ii) Python has been adopted by the Big Data community as a "gluing language". What does this mean, and why is a scripting language more suitable for this task than other programming languages such as Java (a compiled language)? [2]

  (b) What does each of the following four print statements output to the screen? [4]

            string1 = "big data"
      (i)   print string1[3]
      (ii)  print string1[4:]
      (iii) print string1[4:-1]
            string2 = "B" + string1[1:-1] + "A"
      (iv)  print string2

  (c) Imagine you are given a function called isnumeric which takes a string as an argument. It returns True only if all the characters in the string are digits 0-9, and False otherwise (e.g. isnumeric("0123") returns True, while isnumeric("abc") and isnumeric("a4b7c") return False). Instead you want a function that recognizes other numbers too, for example "+8.95" or "-38.4". Explain how you would use the existing function isnumeric to make a new function which can recognize numbers that may start with a sign (+/-) and may contain a single decimal point. Call this function isANumber.
      For example, if isANumber is called with "+1", "1", "1.0", or "-1." it will return True, but called with "1-", "--1", or "1.1…" it will return False. [8]

  (d) The following fragment of code asks the user to enter a polynomial in the variable x, then a value for x, and calculates the result (note that there are a few bugs in this code). Describe line by line what the following code is supposed to do. For example, for line 1: what does it print on the screen, what values does it store (and under what variable name), and what type do you think the variable is? [4]

      polynomial = raw_input('type in a polynomial e.g. x**2 + 3*x + 1')
      print "you typed "+polynomial
      x = raw_input('type in a value for x')
      y = eval(polynomial)
      print "the value of "+polynomial+" at the value x = "+x+" is "+y

  (e) (i) The code in part (d), in its current form, will not even run and will give a number of errors. First identify the errors and then say how you would correct them. Also, cosmetically, is there anything you could do to improve the output? [2]

      (ii) How can the code be further improved? Specifically, the user is asked for a value for x (i.e. a number), but there is nothing stopping the user entering a string, e.g. "cat", in which case the polynomial evaluated at the value "cat" does not make sense. How can the code check for this? Can you see any other problems in the code? [2]

      (iii) Why is it in general a bad idea to use the commands exec() and eval()? [1]

QUESTION 2

2 (a) Python has incorporated a number of features from functional programming. Describe these (higher-order functions, partial functions and anonymous functions), giving an example of each to support your answer. [1]

  (b) What do the following four print statements output to the screen? [8]

      print filter(lambda x: x % 2, [0,1,1,2,3,5,8,13,21,34,55])

      def double(T):
          return (2*T)
      def triple(T):
          return (3*T)
      print map(triple, map(double, [2,4,6,8]))
      print map(double, map(triple, [2,4,6,8]))

      def f(a,b):
          if (a > b):
              return a
          else:
              return b
      print reduce(f, [1,3,5,4,2])

      With the final fragment of code, what does f compute? What does f compute when combined with reduce?

  (c) Let us suppose we want to calculate the average of a list of numbers. We need to do two things: find the total of the list and find the length of the list. Using reduce, how do we find [8]
      a. The total of a list
      b. The length of a list
      An outline of the functions is given below. What does "???" need to be replaced with? (There are two in each function.)

      def totalList(list1):
          return reduce(???, list1, ???)
      def lengthList(list1):
          return reduce(???, list1, ???)

  (d) Given a file containing the following lines of code

      1 ONE one'
      2 two TWO'
      3 three THREE'
      4 four FOUR'
      5 five FIVE'
      6 six SIX'
      7 seven SEVEN'
      8 eight EIGHT'
      9 nine NINE'
      10 ten TEN'

      (i) Which lines of code do the following regular expressions match? [5]
          i.   "^.........$" (please note that there are 9 dots)
          ii.  "[ou]"
          iii. "^..[fs]"
          iv.  " (a|e|i|o|u)" (please note that there is a space at the start)
          v.   "^(..)(..)+$"

      (ii) Write a regular expression that captures each of the following: [3]
          1. An odd number of characters
          2. Lines that contain the "n" character
          3. Lines that do not contain the "n" character

QUESTION 3

3 i. A RESTful API is often used to provide access to data.
     a. Describe what a RESTful API is, including examples of situations in which it would be used, examples of real applications that use it and example code (you can invent variable names and values). [3]
     b. What are the advantages of a RESTful API over a programming language specific API? [2]
     c. Name one advantage and one disadvantage of accessing data via an API over simply downloading an entire dataset. [2]

  ii. JSON is a flexible method of representing data. Write JSON code, using the correct notation (including brackets and quotation marks) and key names of your own choosing, to describe the following objects:
     a. A customer whose ID is 4345 and whose name is John Brown [1]
     b. A customer whose ID is 4345 and who has specified contact preferences as email, phone and fax [2]
     c. A customer whose ID is 4345 and who has bought a product that has a product ID of 322 and the colour Red [2]

  iii. XML and tabular data (such as CSV) are alternative storage formats. Discuss the relative merits of XML, tabular data and JSON. [8]

  iv. The programming language Python has a JSON library. Describe how you would use this library to load JSON encoded strings into Python data structures and then extract the values associated with known keys in the JSON object. Give the Python code you would use to do each of these two things. [5]
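
For orientation on the final part of Question 3, the following is a minimal sketch of loading a JSON string and reading known keys. It is an illustration rather than a model answer: the JSON string and its key names (id, name, contact_preferences) are invented for this example, and only the standard json module and ordinary dictionary access are assumed.

```python
import json

# Hypothetical JSON string; the key names are illustrative, not taken from the exam.
customer_json = (
    '{"id": 4345, "name": "John Brown", '
    '"contact_preferences": ["email", "phone", "fax"]}'
)

# Step 1: json.loads parses a JSON-encoded string into Python data structures.
# A JSON object becomes a dict and a JSON array becomes a list.
customer = json.loads(customer_json)

# Step 2: values are read by key, exactly as with any other dict.
customer_id = customer["id"]
first_preference = customer["contact_preferences"][0]

# dict.get returns a default instead of raising KeyError when a key is absent.
colour = customer.get("colour", "unknown")

print(customer_id)
print(first_preference)
print(colour)
```

Indexing with square brackets suits keys that must be present, since a missing key fails loudly; dict.get suits optional keys where a fallback value is acceptable.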