* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Powerpoint
Survey
Document related concepts
Transcript
Algorithms Session 7 LBSC 790 / INFM 718B Building the Human-Computer Interface Agenda: Weeks 7 and 8 • Questions • Some useful algorithms • Project • Some useful data structures – Including Java implementations Why Study Algorithms? • Some generic problems come up repeatedly – Sorting – Searching – Graph traversal • Need a way to compare alternative solutions • Reusing algorithms is easy and productive – Focusing on the algorithm reveals the key ideas – Language and interface make reusing code hard Sorting • Given an array, put the elements in order – Numerical or lexicographic • Desirable characteristics – – – – Fast In place (don’t need a second array) Able to handle any values for the elements Easy to understand Insertion Sort • Simple, able to handle any data • Grow a sorted array from the beginning – Create an empty array of the proper size – Pick the elements one at a time in any order – Put them in the new array in sorted order • If the element is not last, make room for it – Repeat until done • Can be done in place if well designed Insertion Sort 90 11 27 31 4 16 11 37 Insertion Sort 90 11 27 31 90 4 16 11 37 Insertion Sort 90 11 27 31 11 90 4 16 11 37 Insertion Sort 90 11 27 31 11 27 90 4 16 11 37 Insertion Sort 90 11 27 31 11 27 31 90 4 16 11 37 Insertion Sort 90 11 27 31 4 4 11 27 31 90 16 11 37 Insertion Sort 90 11 27 31 4 4 16 11 37 11 16 27 31 90 Insertion Sort 90 11 27 31 4 4 16 11 37 11 11 16 27 31 90 Insertion Sort 90 11 27 31 4 4 16 11 37 11 11 16 27 31 37 90 Insertion Sort • Sorting can actually be done in place – Never need the same element in both arrays • Every insertion can cause lots of copying – If there are N elements, need to do N insertions – Worst case is about N/2 copys per insertion – N elements can take nearly N2 operations to sort • But each operation is very fast – So this is fine if N is small (20 or so) Merge Sort • Fast, able to handle any data – But can’t be done in place • View the array as a set of small sorted arrays – Initially only the 1-element “arrays” are sorted • Merge pairs of sorted arrays – Repeatedly choose the smallest element in each – This produces sorted arrays that are twice as long • Repeat until only one array remains Merge Sort 90 11 27 31 11 90 4 16 11 37 Merge Sort 90 11 27 31 11 90 27 31 4 16 11 37 Merge Sort 90 11 27 31 4 16 11 37 11 90 27 31 4 16 Merge Sort 90 11 27 31 4 16 11 37 11 90 27 31 4 16 11 37 Merge Sort 11 27 31 90 11 90 27 31 4 16 11 37 Merge Sort 11 27 31 90 4 11 16 37 11 90 27 31 4 16 11 37 Merge Sort 11 27 31 90 4 4 11 16 37 11 11 16 27 31 37 90 Merge Sort • Each array size requires N steps – But 8 elements requires only 3 array sizes k • In general, 2 elements requires k array sizes – So the complexity is N*log(N) • No faster sort (based on comparisons) exists – Faster sorts require assumptions about the data – There are other N*log(N) sorts, though • Merge sort is most often used for large disk files Computational Complexity • Run time typically depends on: – How long things take to set up – How many operations there are in each step – How many steps there are • Insertion sort can be faster than merge sort – One array, one operation per step – But N*log(N) eventually beats N2 for large N • And once it does, the advantage increases rapidly Divide and Conquer • Split a problem into simpler subproblems – Keep doing that until trivial subproblems result • Solve the trivial subproblems • Combine the results to solve a larger problem – Keep doing that until the full problem is solved • Merge sort illustrates divide and conquer – But it is a general strategy that is often helpful Recursion • Divide and conquer problems are recursive – Solve the same problem at increasing granularity • Construct a Java method to solve the problem – Divide the problem into subproblems – Call the same method to solve each subproblem • Unless the subproblems are trivial – Use the parameters to control the granularity • See this week’s notes page for merge sort example A Recipe for Recursion • First, craft a divide and conquer strategy • Create a non-recursive top-level method – Calls recursive method with initial parameters • In the recursive method: – First solve the problem if it is trivial and return • Be sure you eventually get here! – Otherwise, split the problem and call itself – Combine the results and return them Search • Find something by following links – Web pages – Connections in the flight finder – Winning moves in chess • This may seem like an easy problem – But computational complexity can get really bad – Simple tricks can help in some cases Web Crawlers • Goal is to find everything on the web – Build a balanced tree, sorted by search terms • Start anywhere, follow every link • If every page has 1 kB of text and 10 links – Then 10 levels would find a terabyte of data! • Avoid links that are likely to be uninteresting • Detect duplicates quickly with hashing Outside-In Search • Explore multiple paths between two points – Usually trying to find the best by some measure • Flight finder searches like a web crawler – Every possible continuation of every route • Also search backward from the destination • Assuming 10 departures per airfield: – 3 connections takes Flight finder 10,000 steps – 1 connection twice would take 200 steps How to Win at Chess • The paths are the legal moves – And the “places” are possible board positions • You are seeking to make things better – Your opponent seeks to make things worse • Such “zero sum games” are common – Although many lack chess’ shared information • Any problem structure makes search easier – The trick is to exploit constraints effectively Minimax Searching • Decide how many half-moves to look ahead • Develop a scoring strategy for final positions – Based on piece count and positional factors • Follow a promising path – Helpful to guess the best moves for each side • With several moves available, pick the best – But stop any search that can’t improve things Minimax Search Example Min 0 Max 0 3 0 0 -3 5 -3 3 3 5 Traveling Salesperson Problem • Find the cheapest way of visiting 10 cities – Given the airfare between every city pair – Only visit each city once, finish where you start • There are only 90 city pairs – But there are a LOT of possible tours • The best known algorithm is VERY slow – Because the problem is “NP complete” NP Complete Problems • No “polynomial time” algorithm is known 2 3 4 – Not N , N , N … • Haven’t proved that none exists – But if it does, many hard problems would be easy • Approximate solutions with heuristic methods – Greedy methods – Genetic algorithms – Simulated annealing What’s Wrong With Arrays? • Must specify maximum size when declared – And the maximum possible size is always used • Can only index with integers – For efficiency they must be densely packed • Adding new elements is costly – If the elements are stored in order • Every element must be the same type What’s Good About Arrays? • Can get any element quickly – If you know what position it is in • Natural data structure to use with a loop – Do the same thing to different data • Efficiently uses memory – If the array is densely packed • Naturally encodes an order among elements Linked Lists • A way of making variable length arrays – In which insertions and deletions are easy • Very easy to do in Java • But nothing comes for free – Finding an element can be slow – Extra space is needed for the links – There is more complexity to keep track of Making a Linked List • In Java, all objects are accessed by reference – Object variables store the location of the object • New instances must be explicitly constructed • Add reference to next element in each object – Handy to also have a reference to the prior one • Keep a reference to the first object – And then walk down the list using a loop Linked List Example first Jill Joe Tom Public static main (String[] argv) { Student first; … } Public class Student { int String name; public Student next; } Linked List Operations • Add an element – Easy to put it in sorted order • Examine every element – Just as fast as using an array • Find just one element – May be as slow as examining every element • Delete an element after you find it – Fast if you keep both next and prior links first Jill Linked List Insertion Joe Tom public void insert(String newName) { Student temp = first; boolean done = false; while (!done) { if ((temp.next==null) || (temp.next.name.follows(newName))){ Student new = new Student(name, temp.next); temp.next=new; done = true; } temp = temp.next; }} Trees • Linked list with multiple next elements – Just as easy to create as linked lists – Binary trees are useful for relationships like “<“ • Insertions and deletions are easy • Useful for fast searching of large collections – But only if the tree is balanced • Efficiently balancing trees is complex, but possible Binary Tree Example Public class Student { int String name; public Student left; public Student right; } root Joe Jill Tom Data Structures in Java • Resizable array [O(n) insertion, O(1) access]: – ArrayList • Linked list [O(1) insertion, O(n) access, sorted]: – LinkedList • Hash table [object index, unsorted, O(1)]: – HashSet (key only) – HashMap (key+value) • Balanced Tree [object index, sorted, O(log n)]: – TreeSet (key only) – TreeMap (key+value) Hashtables • Find an element nearly as fast as in an array – With easy insertion and deletion – But without the ability to keep things in order • Fairly complex to implement – But Java defines a class to make it simple • Helpful to understand how it works – “One size fits all” approaches are inefficient How Hashtables Work • Create an array with enough room – It helps a lot to guess the right size first • Choose a variable in each object as the key – But it doesn’t need to be an integer • Choose a spot in the array for each key value – Using a fast mathematical function – Best if things are scattered around well • Choose how to handle multiple keys at one spot Java HashMap Class • Hashtables are objects like any other – import java.util.* – Must be declared and instantiated • The constructor optionally takes a size parameter • put(key, value) adds an element • containsKey(key) checks for an element • get(key) returns the “value” object for that key Stacks • Maintain an implicit order – Last In-First Out (LIFO) • Easy additions and deletions – push and pop operations • Maps naturally to certain problems – Interrupt a task to compute an intermediate value • Implemented as a Java class – import java.util.* Choosing a Data Structure • What operations do you need to perform? – Reading every element is typically easy – Other things depend on the representation • Hashing finds single elements quickly – But cannot preserve order • Stacks and linked lists preserve order easily – But they can only read one element at any time • Balanced trees are best when you need both • Which operations dominate the processing?