Assertions (by alaric)
Assertions are a useful tool in defensive programming. If you're writing a bit of code that assumes some condition to be true, since that condition being true is a design constraint of your system so it can only possibly be untrue if there is a bug elsewhere in your code, then it can help to put an explicit assert of that condition before your code.
For example, if you have a program that handles the progress of patients through a suite of operating theatres in a large hospital, you might have a Person
object that has a theatre
pointer, which is NULL
when the person is not in a theatre, or contains a reference to a Theatre
object when the person is in theatre.
Clearly, some methods of Person
might only make sense while that person is in a theatre, and will dereference the theatre
pointer. If they just assume they are only called when the person is in theatre, and for some reason they are called when the person is not, then you'll get a segmentation violation in C++ or a NullPointerException
in Java, or its equivalents in other languages with checked references.
Which is a little unfriendly; much better if you make an up-front assertion upon entry to your methods that theatre != NULL
. That makes it clear where the problem is - a method that needs theatre
to be non-NULL
has been called when theatre
is NULL
; your code still dies - with a SIGABRT
in C-based languages or some kind of assertion-failure exception - but now the source of the exception, found in the stacktrace, points to much closer to where the problem lies (bear in mind that the method might not directly reference theatre
itself but call other methods that eventually do, meaning that the actual problem - the call of a method in a context it shouldn't be called in - might be anywhere in the backtrace, and thus hard to find).
An assertion is usually implemented in the output code as something along the lines of if (!condition) { abort() }
, where abort()
is either "kill the process and force a core dump" or "throw an assertion failure exception"; so at run time, the condition is checked.
But in principle, a smart compiler could examine assertions and use them as input to a compile-time static analysis of the program. If a method starts by asserting that some field of the object is not NULL
, why not look through all the other methods and see which ones assign to that field, and see if we can statically deduce the state of the field in cases where the method is called? This can have several beneficial effects:
- The compiler might find bugs: if it sees you calling the method in a context where the field could well be
NULL
at run time, it can point out how that field could beNULL
in that context, thus revealing a bug. - The compiler might be able to optimise code; if it can prove that the field will never be
NULL
it can skip the conditional in the assertion, and after the assertion it will know that the field cannot possibly beNULL
(please, let's keep multithreading out of this for now...) and so can skip checks forNULL
in the generated object code, thus speeding up execution. - The presence of assertions can simplify the compiler's deductions concerning the state of the system used for checking other assertions. For example, if every method of the class and the constructor all assert that the field is not
NULL
before they return, then without needing to do potentially undecidable analysis of the bodies of the methods, we can know that the field will never byNULL
on entry to any method (although it could becomeNULL
temporarily during the body of a method).
This is really a generalisation of type inference; in a dynamically typed language we might assert that certain expressions have certain types at certain points in the program, but we might make different assertions for the same field in different contexts, expressing how the type might change; and as well as simple type declarations - foo
is an integer - they can also express arbitrary relationships - foo
is less than the sine of bar
.
So it might be interesting to construct a static semantic inferencing system for a programming language that's driven entirely by assertions. Since the assertions might be arbitrary expressions, and thus not necessarily statically (or even dynamically!) decidable, it would have to be an inferencing system that chooses when to give up trying to prove something and makes a conservative assumption, dropping back towards run-time dynamic checks for things it can't prove at compile time - in effect, making a best-effort to deduce what it can in advance as a form of optimisation; one that optimises for execution time and space by removing explicit checks and type tags, and that optimises for programmer time by performing compile-time bug detection!
But I wouldn't implement such a thing directly in terms of an assert
construct in your base language. Better to still implement assert
as a macro that expands to if (!condition) { abort() }
, and make that abort()
into a special construct that means "This should not happen".
Then the static analyser becomes a front-end to a compiler, that takes a dynamically typed language at the input and outputs a type-safe program in a simpler type-unsafe language (by which I mean, one where you can attempt to add a floating point number to a pointer and the system will happily return a value that is neither a useful pointer nor a useful floating point number, without complaining, such as C or FORTH) that a backend compiler then compiles into something useful.
The analyser would compile x+y
into a direct integer addition of x
and y
if it could deduce they're both integers in all possible contexts, and if not, it would compile it into code that has explicit checks for the type of x
and y
, branching off to appropriate addition functions; and its goal would be to reduce the amount of such testing it generates as much as possible.
Any assertions it can't entirely prove at compile time would have to result in runtime checks - in which case, the abort
construct would indeed compile to code that generates a run-time core dump / backtrace.
But the reason I think that we should make abort
the fundamental construct rather than assert
is purely because it's more general, in a way that will make certain optimisations easier. abort
might appear in places other than assert
s - for example, we might write switch
statements with a default case that abort
s, rather than either prefacing the switch
with assert(x == 1 || x ==2 || x ==3)
(checking the valued switched upon is one of the expected values, but being rather long-winded and having to be manually modified whenever the list of cases in the switch is modified), or adding default: assert(1==2)
to the switch (which produces an ugly and misleading error message; the problem is that x
was not an expected value, not that 1==2
).
Through standard program simplification rules, the compiler can find some bugs simply by managing to eliminate conditionals, thus exposing an abort
to code that is always called - and thus trivially marking that code as illegal.
By Faré, Thu 21st May 2009 @ 4:14 am
Have you seen Spec# ? It basically does that!