July 1994/Code Capsules

Code Capsules

C++ Exceptions

Chuck Allison

In last month's article I discussed the various control structures available in C: conditions, loops, non-local jumps, and signals. I showed how the setjmp/longjmp facility (non-local jumps) allows you to specify an alternate return from a function to handle exceptional conditions. But jumping out of a function into a context higher up in the call chain can be risky in C++. The main problem, as Listing 1 illustrates, is that the automatic variables created in the execution path between the setjmp and longjmp may not be properly destroyed. Fortunately, C++ provides an alternate return method that knows how to invoke the appropriate destructors: the C++ exception-handling mechanism.

Stack Unwinding
The program in Listing 2 is a rewrite of Listing 1 using exceptions. You turn on exception handling in a region of code by surrounding it with a try block. You place exception handlers immediately after the try block. An exception handler is like a function definition without a name, and is introduced by the keyword catch. You raise an exception with a throw expression. The statement
throw 1;
in Listing 2 causes the compiler to search for a handler that can catch an integer exception. (Exceptions are C++ objects, though they usually don't have data or function members. Their main purpose is to transmit type information from one part of the program to another.) The ellipsis specification (catch(...)) can catch any type, so control will pass to the handler in main(). But before control passes to this handler the program invokes destructors for all automatic objects constructed since execution entered the try block. This process of destroying automatic variables on the way to an exception handler is called stack unwinding.
As you can see, executing a throw expression is semantically similar to calling longjmp, and a handler is like a setjmp. Just as you cannot longjmp into a function that has already returned, you can only throw to a handler associated with an "active" try block, i.e., one that execution control has not yet exited. After a handler executes, control passes to the first statement occurring after all the handlers defined in the source text (which is return 0 in this case).
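The order of events during stack unwinding can be sketched as follows. This is a minimal illustration rather than the text of Listing 2, and it assumes a current compiler; the names Tracer, trace, and run are hypothetical and exist only so that the sequence of constructor and destructor calls can be observed.

```cpp
#include <string>

std::string trace;   // hypothetical log of constructor/destructor calls

// Hypothetical class that records its own construction and destruction.
struct Tracer {
    const char* name;
    explicit Tracer(const char* n) : name(n) { trace += std::string("+") + name + " "; }
    ~Tracer() { trace += std::string("~") + name + " "; }
};

void g()
{
    Tracer inner("inner");
    throw 1;             // raise an integer exception
}

void f()
{
    Tracer outer("outer");
    g();                 // never returns normally
}

// Runs f inside a try block and returns the recorded sequence of events.
std::string run()
{
    trace.clear();
    try {
        f();
    }
    catch (...) {        // the ellipsis catches any type
        trace += "caught";
    }
    return trace;
}
```

Calling run() yields a log in which both destructors appear, innermost object first, before the handler's own entry. That is the behavior a longjmp out of g would not give you.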
The notable differences between exceptions and the setjmp/longjmp facility are:
1) Exception handling is a language mechanism, not a library feature. The overhead of passing control is invisible to the programmer.
2) The compiler generates code to keep track of all automatic variables with destructors and to execute those destructors when necessary (i.e., it unwinds the stack).
3) Local variables in the function that contains the try block are "safe" — you don't need to declare them volatile to keep them from being corrupted as you do with setjmp.
4) The program finds handlers by matching the type of the exception object thrown. This methodology allows you to handle categories of exceptions with a single handler, and to classify them via inheritance (see "Grouping Exceptions" below).
5) Exception handling is a run-time mechanism. You can't always tell which handler will catch a specific exception by examining the source code.
The last point is significant. C++ exception handling allows a clean separation between error detection and error handling. A library developer may create code to detect when an error occurs, such as an argument out of range, but won't know how to handle the error. You, the user, can't detect an error condition in a library function, but you know how your application needs to handle one, should it occur. Hence exceptions constitute a protocol for run-time communication between components of an application.
It is also important to realize that C++ exception handling is designed around the termination model. That is, when an exception occurs, there is no direct way to get from the handler back to the throw point and resume execution where you left off, as you can in languages that follow the resumption model. C++ exceptions are for rare, synchronous events. (In this context, a synchronous event is one that your program causes. This is as opposed to an asynchronous event, which occurs due to forces your program can't control, such as the user pressing the attention key.)

Catching Exceptions
Since exceptions are a run-time and not a compile-time feature, the standard working paper specifies the rules for matching exceptions to catch-parameters a little differently than those for finding an overloaded function to match a function call. You can define a handler for an object of type T in one of the following ways (the variable t is optional, just as it is for functions in C++):

catch(T t)
catch(const T t)
catch(T& t)
catch(const T& t)
Such a handler can catch exception objects of type E if:
1) T and E are the same type, or
2) T is an accessible base class of E at the throw point, or
3) T and E are pointer types and there exists a standard pointer conversion from E to T at the throw point.
T is an accessible base class of E if there is an inheritance path from E to T with all derivations either public or protected, or if E is a friend of T. To understand (3), let E be a type pointing to type F, and T a type that points to type U. Then there exists a standard pointer conversion from E to T if one of the following holds:
T is the same type as E, except that it may add either or both of the qualifiers const and volatile, or
T is void *, or
U is an unambiguous, accessible base class of F.
U is an unambiguous base class of F if F's members can refer to members of U without ambiguity (this is usually only a concern with multiple inheritance).
The bottom line of all these rules is that exceptions and catch parameters must either match exactly, or the exception must be derived from the type of the catch parameter. For example, in the following, the exception thrown in f is not caught:

#include <iostream.h>

void f();

main()
{
   try
   {
      f();
   }
   catch(long)
   {
      cerr << "caught a long" << endl;
   }
   return 0;
}

void f()
{
   throw 1;    // not a long!
}
When the system can't find a handler for an exception, it calls the standard library function terminate(), which by default aborts the program. You can substitute your own termination function by passing its address as a parameter to the set_terminate() library function.
In the following, the exception thrown in f is caught, since there is a handler for an accessible base class:

#include <iostream.h>

class B {};
class D : public B {};

void f();

main()
{
   try
   {
      f();
   }
   catch(B&)
   {
      cerr << "caught a B" << endl;
   }
   return 0;
}

void f()
{
   throw D();
}
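The pointer rules can be tried out in the same way. The following is a minimal sketch, assuming a current compiler; the function names and the object d_object are hypothetical, and each function simply reports which handler was chosen.

```cpp
#include <string>

class B {};
class D : public B {};

D d_object;   // hypothetical object with static storage, so its address stays valid

// An int is not converted to long when matching handlers.
std::string catch_int_as_long()
{
    try { throw 1; }
    catch (long) { return "long"; }
    catch (...)  { return "ellipsis"; }
}

// A D* converts to B*, and qualifiers may be added: the const B* handler matches.
std::string catch_derived_pointer()
{
    try { throw &d_object; }
    catch (const B*) { return "const B*"; }
    catch (...)      { return "ellipsis"; }
}

// Any object pointer converts to void*: the void* handler matches.
std::string catch_pointer_as_void()
{
    try { throw &d_object; }
    catch (void*) { return "void*"; }
    catch (...)   { return "ellipsis"; }
}
```

The first function ends up in the ellipsis handler, since no arithmetic conversions are applied when matching; the other two are caught by the pointer handlers listed first.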

Grouping Exceptions
Last month's program for deleting directories appears in Listing 3 using character-string exceptions instead of setjmp and longjmp. Note that the char * handler uses its parameter to print out an appropriate message. If your application uses a large number of exceptions, you may prefer to use different classes for the different types of exceptions. The program in Listing 4 defines the following class hierarchy:
Dir_err
     Bad_dir
     Dir_open_err
     File_del_err
     Dir_del_err
Although this program does not throw a Dir_err exception explicitly, it is a good idea to have a Dir_err handler anyway. Then if you add a new exception, Dir_read_err, say, in the future, but forget to add a corresponding handler, the program will still catch such exceptions as long as you derive Dir_read_err from Dir_err. Be sure to place the handler for base-class exceptions after the definitions of all the derived-type handlers. When you throw an exception, the run-time mechanism searches the handlers in source-code order for a match. If you place a base-class handler such as Dir_err in front of the derived-type handlers, Dir_err will catch all exceptions of derived type, rendering any subsequent handlers unreachable.
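A minimal sketch of this handler ordering follows. It is not the text of Listing 4: the class names come from the hierarchy above, but the enumeration Failure and the functions fail and handle are hypothetical stand-ins for the directory code, and a current compiler is assumed.

```cpp
#include <string>

class Dir_err {};
class Bad_dir      : public Dir_err {};
class Dir_open_err : public Dir_err {};
class File_del_err : public Dir_err {};
class Dir_del_err  : public Dir_err {};
class Dir_read_err : public Dir_err {};   // added later; has no handler of its own

enum Failure { bad_dir, dir_open, dir_read };   // hypothetical error selector

// Hypothetical stand-in for code that detects an error.
void fail(Failure why)
{
    switch (why) {
    case bad_dir:  throw Bad_dir();
    case dir_open: throw Dir_open_err();
    case dir_read: throw Dir_read_err();
    }
}

// Derived-class handlers come first; the base-class handler comes last.
std::string handle(Failure why)
{
    try {
        fail(why);
    }
    catch (Bad_dir&)      { return "Bad_dir"; }
    catch (Dir_open_err&) { return "Dir_open_err"; }
    catch (Dir_err&)      { return "Dir_err"; }   // safety net for the whole family
    return "no exception";
}
```

Here handle(dir_read) is caught by the Dir_err handler even though nobody wrote a handler for Dir_read_err. Moving the Dir_err handler to the top would make it intercept Bad_dir and Dir_open_err as well.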
Since some errors fit naturally into more than one category, exception hierarchies provide a meaningful use for multiple inheritance, for example:
class Disk_write_error : public Disk_error,
                     public Write_error
{};
Exception classes are mainly used to classify the type of exception being thrown; they usually carry very little functionality or data. Hence, users of exception classes can enjoy the benefits of multiple inheritance without complications such as ambiguous references to inherited members.
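As a brief sketch of what this buys you, an exception with two base classes can be caught by a handler for either one. Disk_error and Write_error are assumed here to be empty classes, and the two reporting functions are hypothetical.

```cpp
#include <string>

class Disk_error {};
class Write_error {};
class Disk_write_error : public Disk_error,
                         public Write_error
{};

// Code that only cares about disk problems catches the Disk_error part.
std::string as_disk_error()
{
    try { throw Disk_write_error(); }
    catch (Disk_error&) { return "Disk_error"; }
    catch (...)         { return "unknown"; }
}

// Code that only cares about failed writes catches the Write_error part.
std::string as_write_error()
{
    try { throw Disk_write_error(); }
    catch (Write_error&) { return "Write_error"; }
    catch (...)          { return "unknown"; }
}
```

Both functions find a matching handler, because each base class is an unambiguous, accessible base of Disk_write_error.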

Standard Exceptions
In March 1994 the standards committee decided that the standard C++ library will only throw exceptions derived from the following hierarchy:

exception
     logic
          bad-cast
          domain
     runtime
          range
          alloc
Logic exceptions are due to errors in the internal logic of a program. Domain errors are a type of logic error that violate the preconditions of a function, for example, by using an out-of-range index or attempting to use a bad file descriptor. As another example, the bit-string class from the standard library throws an out-of-range exception (derived from domain despite its name) if you ask it to set a bit that doesn't exist:

bitstring& bitstring::set(size_t pos, int val)
{
   if (pos >= nbits_)
      out_of_range("invalid position").raise();
   set_(pos, val);
   return *this;
}
The library throws all exceptions by calls to the member function exception::raise().
The bad-cast exception class pertains to C++'s run-time type identification facility (RTTI), and is outside the scope of this article.
Run-time errors are those that you cannot easily predict, either in theory or by careful analysis of source code, and are usually due to forces external to a program. A range error is a run-time error that violates a postcondition of a function, an example being arithmetic overflow from valid arguments. The other kind of run-time exception is the alloc exception, which occurs when heap memory is exhausted (see "Memory Management" below).
The program in Listing 5 redefines the directory exceptions of Listing 4 to fit into the standard exception hierarchy. The directory errors are clearly run-time exceptions, since they depend on the return status of system services, not errors in the code. The exception base class provides a member function what(), which returns the string argument that created the exception. I use this function to pass information about the error to the exception handler.
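The idea of carrying a message in the exception and reading it back with what() can be sketched like this. This is not Listing 5. It assumes a current compiler, where the corresponding library classes are spelled std::exception and std::runtime_error in <stdexcept> rather than the draft names used above, and the functions open_dir and report are hypothetical.

```cpp
#include <stdexcept>
#include <string>

// Directory errors placed under the library's run-time error class
// (std::runtime_error is the modern spelling; an assumption of this sketch).
class Dir_err : public std::runtime_error {
public:
    explicit Dir_err(const std::string& msg) : std::runtime_error(msg) {}
};

class Dir_open_err : public Dir_err {
public:
    explicit Dir_open_err(const std::string& msg) : Dir_err(msg) {}
};

// Hypothetical stand-in for a system call that fails.
void open_dir(const std::string& name)
{
    throw Dir_open_err("cannot open " + name);
}

// The handler only needs to know the root of the hierarchy to get the message.
std::string report(const std::string& name)
{
    try {
        open_dir(name);
    }
    catch (const std::exception& e) {
        return e.what();
    }
    return "ok";
}
```

A call such as report("tmp") returns the text that was supplied at the throw point, so the detecting code decides what the message says and the handling code decides what to do with it.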
If you have Borland C++ 4.0, you will need to insert the text of Listing 6 somewhere near the end of the standard header file <except.h> to make the new standard exception names available. Even though vendors like Borland and Watcom faithfully track the progress of the standards committee, their current versions were released before the standards committee redefined the standard names in March.

Resource Management
While stack unwinding takes care of destroying automatic variables, there may still be some cleanup left to do. As the program in Listing 7 shows, the file opened in f will not get closed if you throw an exception in the middle of f. One way to guarantee that a resource will be deallocated is to catch all exceptions in the function where the deallocation takes place. In Listing 8 the handler in f closes the file and then passes whatever exception occurred on up the line by rethrowing it (that's what throw without an argument does). This technique could become tedious, however, if you use the same resource in many places.
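The catch-and-rethrow technique looks roughly like this. The sketch below is not Listing 8: it assumes a current compiler, uses std::tmpfile so that no file name is needed, and the function process and the flag file_closed are hypothetical (the flag is there only so the cleanup can be observed).

```cpp
#include <cstdio>
#include <stdexcept>

bool file_closed = false;   // hypothetical flag recording that cleanup ran

// Hypothetical processing step that fails.
void process(std::FILE*)
{
    throw std::runtime_error("processing failed");
}

void f()
{
    std::FILE* fp = std::tmpfile();   // may be null if no temporary file is available
    file_closed = false;
    try {
        process(fp);
    }
    catch (...) {
        // Clean up, then pass the same exception on up the line.
        if (fp) std::fclose(fp);
        file_closed = true;
        throw;
    }
    // Normal path: the same cleanup has to be repeated here.
    if (fp) std::fclose(fp);
    file_closed = true;
}
```

Note that the cleanup code appears twice, once for the exceptional path and once for the normal one, which is part of what makes this approach tedious.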
A better method arranges things so that stack unwinding will deallocate the resource automatically. In Listing 9 I create a local class File whose constructor opens the file and whose destructor closes it. Since the File object x is automatic, its destructor is guaranteed to execute as the stack unwinds. You can use this technique, which Bjarne Stroustrup calls "Resource Acquisition is Initialization," to safely handle any number of resources.
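A minimal sketch of such a class follows. It is not Listing 9: it assumes a current compiler, opens a temporary file with std::tmpfile instead of a named one, and the counter open_files is a hypothetical addition that lets you check that nothing is left open.

```cpp
#include <cstdio>
#include <stdexcept>

int open_files = 0;   // hypothetical count of files currently open

class File {
    std::FILE* fp_;
public:
    File() : fp_(std::tmpfile()) { if (fp_) ++open_files; }   // acquire
    ~File() { if (fp_) { std::fclose(fp_); --open_files; } }  // release
    File(const File&) = delete;              // one owner per open file
    File& operator=(const File&) = delete;
    std::FILE* get() const { return fp_; }
};

void f()
{
    File x;   // automatic object: destroyed during stack unwinding
    throw std::runtime_error("processing failed");
}
```

After f exits by way of the exception, open_files is back to zero. There is no handler in f at all, and the cleanup code is written once, in the destructor.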

Constructors and Exceptions
One of the motivations for adding exceptions to the language was to compensate for a constructor's inability to return a value. Without the benefit of exceptions, handling resource errors that occur inside constructors becomes very awkward. A typical practice is to use an internal state variable and test it before using the resource. The program in Listing 10 uses the file pointer returned from fopen to determine if the resource is available or not. The operator void*() member function returns a non-zero pointer value if all is well, so you can test the state like this:
if (x)
   // All is well
else
   // Resource unavailable
(In the future, a standard-conforming compiler will allow you to use an operator bool member function for this purpose.)
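With a compiler that supports it, the state test might be sketched with a conversion to bool as shown here. This is not Listing 10; it assumes C++11 or later (for explicit conversion operators), and the function can_read is hypothetical.

```cpp
#include <cstdio>

class File {
    std::FILE* fp_;   // null if the open failed
public:
    File(const char* name, const char* mode) : fp_(std::fopen(name, mode)) {}
    ~File() { if (fp_) std::fclose(fp_); }
    File(const File&) = delete;
    File& operator=(const File&) = delete;

    // Reports whether the resource is available.
    explicit operator bool() const { return fp_ != nullptr; }
};

// Hypothetical caller: every use must be preceded by a test of the state.
bool can_read(const char* name)
{
    File x(name, "r");
    if (x)
        return true;    // All is well
    else
        return false;   // Resource unavailable
}
```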
Having to test an object before each use can be quite tedious. As Listing 11 illustrates, it may be more convenient to throw an exception directly from the constructor.
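A constructor that reports failure by throwing might look like the following sketch. It is not Listing 11: it assumes a current compiler, it uses std::runtime_error as the exception type (my choice for illustration), and the function try_open is hypothetical.

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>

class File {
    std::FILE* fp_;
public:
    File(const char* name, const char* mode)
        : fp_(std::fopen(name, mode))
    {
        // No object is created if the file can't be opened.
        if (!fp_)
            throw std::runtime_error(std::string("cannot open ") + name);
    }
    ~File() { std::fclose(fp_); }   // fp_ is never null in a constructed object
    File(const File&) = delete;
    File& operator=(const File&) = delete;
    std::FILE* get() const { return fp_; }
};

// Hypothetical caller: no state test is needed once x exists.
std::string try_open(const char* name)
{
    try {
        File x(name, "r");
        return "opened";
    }
    catch (const std::runtime_error& e) {
        return e.what();
    }
}
```

Because the destructor runs only for fully constructed objects, it never sees a null file pointer, and code that uses a File never has to ask whether it is valid.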

Memory Management
Before exceptions, if a new operation failed to allocate an object it returned a null pointer, just like malloc does in C:

T *tp = new T;
if (tp)
   // Use new object
The standard C++ working paper now stipulates that a memory allocation failure should result in an alloc exception. The standard library as well as third-party libraries may make extensive use of heap memory, so a memory allocation request can occur when you least expect it. Therefore, almost any program you write should be prepared to handle an alloc exception.
The obvious way to handle an alloc exception is to supply an alloc handler:

catch(alloc)
{
   cerr << "Out of memory" << endl;
   abort();
}
Depending on your application, you may be able to do something more interesting than this, such as recover some memory and try again.
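As a small sketch of handling the failure locally instead of aborting, the function below reports whether a request could be satisfied. It assumes a current compiler, where the exception is spelled std::bad_alloc in <new> rather than the draft name alloc. The name try_allocate is hypothetical, and it calls operator new directly so that the request size can be chosen at run time.

```cpp
#include <cstddef>
#include <new>

// Hypothetical helper: returns true if n bytes could be allocated.
bool try_allocate(std::size_t n)
{
    try {
        void* p = ::operator new(n);   // throws std::bad_alloc on failure
        ::operator delete(p);
        return true;
    }
    catch (const std::bad_alloc&) {
        // Out of memory: the caller can free something and try again.
        return false;
    }
}
```

A request for a few bytes should succeed, while a request for roughly half the address space should come back false instead of ending the program.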
Another way of handling out-of-memory conditions is to replace parts of the memory allocation machinery itself. When execution encounters a statement such as

T *tp = new T;
it calculates the amount of storage required (say, N) and calls the standard library function operator new(size_t) (with an argument of N). If N bytes are not available, operator new() calls the new handler, an unnamed function that by default throws an alloc exception. You can either define your own operator new(size_t) to displace the default version, or replace the default new handler with a function of your own. A typical custom new handler tries to free up some memory and returns, allowing operator new() to try again. If you can't come up with any more memory, then you should either throw an alloc exception or abort the program.
To install a function, nh(), as a new handler, pass it to the standard library function set_new_handler, which returns a pointer to the currently-installed handler, in case you want to restore it later:

void nh();
void (*old_handler)();

// Install custom new-handler
old_handler = set_new_handler(nh);
...
// do some processing here
...
// Restore the old one
set_new_handler(old_handler);
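A new handler that frees up memory before giving up might be sketched as follows. This assumes a current compiler, where the functions live in namespace std and the exception is spelled std::bad_alloc. The reserve block, its size, and the function install are hypothetical choices made for illustration.

```cpp
#include <new>

char* reserve = nullptr;   // hypothetical block of memory set aside for emergencies

// Custom new handler: release the reserve once, then give up.
void nh()
{
    if (reserve) {
        delete[] reserve;
        reserve = nullptr;
        return;                // operator new() will try the request again
    }
    throw std::bad_alloc();    // nothing left to free
}

// Sets the reserve aside and installs nh; returns the previous handler.
std::new_handler install()
{
    if (!reserve)
        reserve = new char[64 * 1024];
    return std::set_new_handler(nh);
}
```

The first time memory runs out, nh hands the reserve back and returns, which lets the failed request be retried; the second time it throws. Keep the value returned by install if you want to restore the previous handler later.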
If you want your program to revert to the traditional behavior, where new returns a null pointer, call:

set_new_handler(0);
[This is not guaranteed by the draft C++ Standard, but is recommended as a "common extension" - pjp]

Exception Specifications
You can enumerate the exceptions that a function will throw with an exception specification:

void f() throw(A,B)
{
   // Whatever
}
This definition states that while f is executing, only exceptions of type A or B will be thrown. Besides being good documentation, this run-time system feature ensures that only the allowable types of exceptions occur. In the event of any other exception, control passes to the standard library function unexpected, which by default terminates the program. The definition of f above is equivalent to:

void f()
{
   try
   {
      // Whatever
   }
   catch(A)
   {
      throw;      // rethrow
   }
   catch(B)
   {
      throw;      // rethrow
   }
   catch(...)
   {
      unexpected();
   }
}
You can provide your own unexpected handler by passing its address to the standard library function set_unexpected. The definition

void f() throw()
{
   // Whatever
}
disallows any exceptions while f is executing; that is, it is equivalent to:

void f()
{
   try
   {
      // Whatever
   }
   catch(...)
   {
      unexpected();
   }
}
The challenge posed by exception specifications is that a function such as f may call other functions that throw other exceptions. You must know what exceptions can be thrown by those functions and either include them in the specification for f, or handle them explicitly within f. This situation creates a problem if f calls services whose future versions add new exceptions. If those new exceptions are derived from A or B, you don't have a problem, but that is not likely to happen with commercial libraries. The standards committee is still debating whether to use exception specifications in the standard library.

Afterword
Perhaps the most important thing to say about exceptions is that you should only use them in truly exceptional circumstances. Like setjmp and longjmp, they interrupt your program's normal flow of control. Your code should contain relatively few exception handlers compared to its number of functions. In addition, exceptions have been designed for synchronous events only; don't mix exceptions and asynchronous events, such as signals.