Transcript
CSE 341 - Unit 4 - Equivalent Functions [00:00:00.00] [00:00:05.15] The last major topic we will consider in this section is a deeper and more precise understanding of when two functions are the same, or I'll call them equivalent. So this is important to look at carefully. I'm not going to show you any new coding idioms, or clever new things, or language constructs, but it's such a fundamental idea in software engineering that I think it's worth focusing on. [00:00:29.34] We're going to see that you can replace one function with another if you're very careful, and you know what it means for two things to be equivalent. And we'll see that things are more likely to be equivalent when you are using more abstraction, and when you have fewer side effects. If you can assume that other computations don't do things like mutate references or print things out, then additional things are equivalent. [00:00:55.83] So let me motivate why we're looking at this. I believe that developers, people programming, think about equivalence all the time. When you are maintaining code, and you say, oh, I have a nicer style way, I have a nicer way to express this, what you're really saying is to express this same thing or to express an equivalent thing. No one will be able to tell that I did this code cleanup. [00:01:20.54] You are also thinking about this all the time when you're looking at backward compatibility. Can I add new features without changing how this software behaves for any of the old features? It still behaves equivalently for all old possibilities. Code optimization, whether manual, when you're writing the code or if you're implementing a language, is all about equivalence. Can I speed up this code without changing how it behaves on any inputs? And finally, when we were studying our module system, we did this a bit as well. Can an external client tell if I replace this implementation with another implementation? 
[00:01:57.30] Now, what we're going to look at here is not necessarily related to modules or abstract types. Instead, what we'll do is say here are two functions. Are they equivalent for all possible calls to them? [00:02:09.62] So maybe I'm implementing a library. I don't know all the clients of my library. I might be putting this code up on the internet for people to use. I want to be able to think about could I replace this function with this other function without any possible call to these functions ever being able to tell. That's what equivalence is all about. [00:02:30.65] Now, we need to define what it means for two functions to behave the same way. And I will say that they have the same observable behavior if, given equivalent arguments, they meet all of these bullet points you see on this slide. [00:02:45.92] Clearly, they need to always return the same answer. If one takes three and returns seven, and the other takes three and returns eight, this is no good, right? But that's not enough. They also have to have the same non-termination behavior. If one of them doesn't terminate on nine, the other needs to not terminate on nine. And similarly, they need to terminate on all the same arguments. [00:03:09.11] Any effects they have on mutable references that other parts of the program can see have to be the same. If one of them updates a reference to a different value than the other does, then after the call completes, some other code in the program might be able to tell that you replaced the first function with the second function. They have to have the same input-output behavior. We can't have one printing something and the other not printing the same thing. [00:03:37.02] And they have to raise the same exceptions. We can't have one of them choose to raise an exception in a situation where the other one doesn't. And I may have even forgotten things here, but I think this is a pretty good list, all right.
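As a small illustration of why "same answer" alone is not enough, here is a minimal Standard ML sketch. The functions first_a and first_b are hypothetical and do not come from the lecture slides; they agree on every non-empty list but differ in whether they raise an exception on the empty list, so they are not equivalent under the definition above.

```sml
(* Hypothetical pair, not from the lecture slides. *)

(* Returns the first element; hd raises Empty on []. *)
fun first_a (xs : int list) = hd xs

(* Returns the first element, or 0 on []. Never raises. *)
fun first_b (xs : int list) =
    case xs of
        [] => 0
      | x :: _ => x

(* On a non-empty list the two return the same answer. *)
val same_on_nonempty = (first_a [7, 8] = first_b [7, 8])

(* On [] a caller can tell them apart: one raises, the other returns. *)
val a_on_empty = (SOME (first_a [])) handle Empty => NONE
val b_on_empty = SOME (first_b [])
```

Here same_on_nonempty is true, while a_on_empty is NONE and b_on_empty is SOME 0, which is exactly the kind of observable difference the list rules out.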
[00:03:48.56] Now, notice it's going to be easier for two functions to be equivalent if users of these functions cannot use them with as many arguments. For example, if you have a nice, strong type system, that will ensure that-- say these functions only take string * string as arguments, and we don't have to think about, well, what would happen if someone passed an int instead, because no such calls will ever type check. [00:04:14.96] We will also see that it's much easier for two functions to be equivalent if our language is a functional language where there are fewer ways to make side effects. And sometimes people even assume that you won't make side effects even if the language allows them. I'll show you an example of that in just a second, OK. So let's finish up this segment with a few examples, and then we'll move on to some more universal notions of function equivalence. [00:04:42.35] So on the top here, I have two functions, f, that are equivalent. So the function on the left takes in an argument, x, and returns x plus x. The one on the right takes in an argument, x, and returns y times x. This is defined in an environment where y is two. Both these functions always double their argument. They have no other side effects. They always terminate. They are equivalent in every way. No ML program using the function on the left would ever be able to tell if you swapped that out for the function on the right and vice versa. So that's a good example. [00:05:14.18] Here's an example where you might be surprised that the two functions are not equivalent if we are not careful about our assumptions. So the function on the left, g, takes in a function, f, and an argument, x, and returns f applied to x plus f applied to x. The one on the right multiplies f applied to x by y, and y is bound to two. [00:05:36.26] So these will always return the same answer, assuming f is a function that always returns the same thing when given the same argument. 
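The slide itself is not reproduced in the transcript, so the following is only a reconstruction from the spoken description. The suffixes _left and _right are assumptions added so that both versions can live in one file (on the slide both functions in a pair share one name), and add_one is a hypothetical pure argument used for the check.

```sml
val y = 2

(* First pair: both double their argument. *)
fun f_left x = x + x
fun f_right x = y * x

(* Second pair: each takes a function f and an argument x. *)
fun g_left (f, x) = (f x) + (f x)    (* calls f twice *)
fun g_right (f, x) = y * (f x)       (* calls f once *)

(* Hypothetical pure function: no printing, no mutation, no exceptions. *)
fun add_one n = n + 1

(* With a pure argument, both versions of each pair return the same answer. *)
val doubles_agree = (f_left 5 = f_right 5)
val g_agree = (g_left (add_one, 3) = g_right (add_one, 3))
```

Both doubles_agree and g_agree evaluate to true here, since f_left 5 and f_right 5 are both 10, and both versions of g return 8 for add_one and 3.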
But the code on the left calls f twice, and the code on the right calls f once. And that is a problem if f could have side effects. Suppose it increments a reference every time. Well, then the code on the left will set that reference to two more than it used to have, and the code on the right one more. [00:06:02.54] An even simpler example is if f always prints something out. Like this function here. If you pass this for f, it prints out hi and then returns its argument. The code on the left will print hi twice. The code on the right will print hi once. So when you are in a functional programming language, we typically have functions that don't do this sort of thing. And if you assume these things don't happen, then the functions are equivalent. [00:06:27.53] Some languages, and Haskell is a good example, enforce a notion of pure functional programming. Most functions in the language, those that you can pass to other functions like this, cannot do things like print out. And as a result, the corresponding code in a language like Haskell is equivalent on the left and the right. So this is yet another advantage of avoiding side effects in your code. [00:06:52.10] Let's do one more example. Here are-- it's a few more lines of code, but it's actually very simple what's going on. The function f here on the left assumes that we have in our environment some functions g and h. It calls g with x, calls h with x, and returns a pair of the results. The code on the right is exactly the same, except it just calls h and g in the opposite order. So unlike in the previous example, there is no number-of-calls problem here. Both of these call g once and call h once. [00:07:19.96] So are they equivalent? Again, if g and h are pure functions-- they don't have side effects, they just compute something and return a result-- then yes, they're equivalent. But if g and h can have side effects, then no, not necessarily. [00:07:34.91] Suppose g prints something, and h prints something. 
Well, then the code on the left will print those outputs in one order. The code on the right will print them in the opposite order. [00:07:45.41] Here's another example. Suppose g sets some mutable reference, and h reads that same mutable reference. Then in the code on the left, h is going to see the new value after whatever g wrote to it. And in the code on the right, h is going to see the value before g does its write, because h executes before g executes. [00:08:06.48] So once you have mutations, side effects, printing, we suddenly have to worry about the order we do things. And different orders lead to functions that are not equivalent. But if we stick to a functional style, where we don't write functions with side effects, then we do not have to worry about order. And we can execute these functions in either order, with one final caveat: if g of x and h of x both raise exceptions, and raise different exceptions, then the order could matter again. The code on the left would raise one exception, and the code on the right would raise another exception.
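Both side-effect problems from this segment, the number of calls and the order of calls, can be reproduced with a mutable reference. What follows is a sketch under stated assumptions rather than the code from the slides: the names calls, counting_f, shared, and the _left and _right suffixes are all hypothetical, and the particular bodies of g and h are just one way to make g write a reference that h reads.

```sml
(* Number of calls: counting_f bumps a counter each time it runs. *)
val calls = ref 0
fun counting_f n = (calls := !calls + 1; n)

fun g_left (f, x) = (f x) + (f x)    (* calls f twice *)
fun g_right (f, x) = 2 * (f x)       (* calls f once *)

val _ = calls := 0
val r_left = g_left (counting_f, 3)
val calls_left = !calls              (* counter after the left version *)

val _ = calls := 0
val r_right = g_right (counting_f, 3)
val calls_right = !calls             (* counter after the right version *)

(* Order of calls: g writes a shared reference and h reads it. *)
val shared = ref 0
fun g x = (shared := x; x)
fun h x = !shared + x

fun f_left x =
    let
        val a = g x                  (* g runs first *)
        val b = h x
    in
        (a, b)
    end

fun f_right x =
    let
        val b = h x                  (* h runs first *)
        val a = g x
    in
        (a, b)
    end

val _ = shared := 0
val p_left = f_left 5                (* h sees the value g wrote *)
val _ = shared := 0
val p_right = f_right 5              (* h sees the old value *)
```

The return values r_left and r_right are both 6, but calls_left is 2 and calls_right is 1, so other code reading the counter can tell the two versions of g apart. Likewise p_left is (5, 10) and p_right is (5, 5), because h reads the reference before or after g writes it depending on the order.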