Correctness and testing
If we take the long view, programmers have–to put it gently–a pretty mixed track record vis-a-vis bugs; writing bug-free software is really hard! At the same time, it’s quite important to ensure that compilers are correct. If your compiler has a bug, it could impact anyone writing programs in your language and is likely to be very hard for end-users to track down. So: what can we do to make sure that our software is correct?
We have a number of options, all of which have been applied (to varying degrees of success) to compilers:
- Hand-written correctness arguments
- Testing on particular programs
- Testing different compilers against each other
- Formal verification
For this class, our approach is going to be as follows. As we grow our language, we’re going to add features both to our compiler and to a definitional interpreter. This is an interpreter for our language where we don’t care about performance at all; all we care about is that it’s correct. Indeed, this interpreter will serve as the definition of our language.
Here’s such an interpreter for the language we have so far:
let rec interp_exp (exp : s_exp) : int = match exp with | Num n -> n | Lst [Sym "add1"; arg] -> (interp_exp arg) + 1 | Lst [Sym "sub1"; arg] -> (interp_exp arg) - 1 | _ -> raise (BadExpression exp) let interp (program : string) = string_of_int (interp_exp (parse program))
Notice that our interpreter has a pretty similar structure to our compiler. We’ll see this throughout the semester.
Now that we have an interpreter, we can test our compiler against it:
let difftest (examples : string list) = let results = List.map (fun ex -> (compile_and_run ex, interp ex)) examples in List.for_all (fun (r1, r2) -> r1 = r2) results let test () = difftest ["43"; "(add1 (add1 3))"; "(sub1 4)"]