Code Capsules

The Standard C Library, Part 2

Chuck Allison


Last month I divided the fifteen headers of the Standard C Library into three groups, each representing a different level of mastery (see Table 1, Table 2, and Table 3). I continue this month by exploring Group II.

Group II: For the "Polished" C Programmer

<assert.h>

Well-organized programs provide key points where you can make assertions, such as "the index points to the next open array element." It is important to test these assertions during development and to document them for the maintenance programmer (which, of course, is often yourself). ANSI C provides the assert macro for this purpose. You could represent the assertion above, for example, as

   #include <assert.h>
   . . .
   assert(nitems < MAXITEMS && i == nitems);
   . . .
If the condition holds, all is well and execution continues. Otherwise, assert prints a message containing the condition, the file name, and line number, and then calls abort to terminate the program.

Use assert to validate the internal logic of your program. If a certain thread of execution is supposed to be impossible, say so with the call assert(0), as in:

switch (color)
{
    case RED:
        . . .
    case BLUE:
        . . .
    case GREEN:
        . . .
    default:
        assert(0);
}
assert is also handy for validating parameters. A function that takes a string argument, for example, could do the following:

   char * f(char *s)
   {
       assert(s);
       . . .
   }
Assertions are for logic errors, of course, not for run-time errors. A logic error is one you could have avoided by correct design. For example, no action on the user's part should be able to create a null pointer — that's clearly your fault, so it is appropriate to use assert in such cases. On the other hand, a run-time error, such as a memory failure, requires more bulletproof exception handling.
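
To make the distinction concrete, here is a minimal sketch of a function that treats the two kinds of error differently. The name copy_string is my own invention for illustration, not a library function. A null argument is the caller's logic error, so it gets an assertion; running out of memory is a run-time condition, so it is reported through the return value.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of s,
   or NULL if memory is exhausted. The caller frees the result. */
char *copy_string(const char *s)
{
    char *copy;

    assert(s != NULL);              /* logic error: never pass a null pointer */

    copy = malloc(strlen(s) + 1);
    if (copy == NULL)               /* run-time error: report it, don't assert */
        return NULL;

    strcpy(copy, s);
    return copy;
}
```

A caller still has to check the result for NULL, since defining NDEBUG removes only the assertion and leaves the memory check in place.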

When your code is ready for production, you should have caught all the bugs, so you should turn off assertion processing. To do so, you can either include the statement

   #define NDEBUG
at the beginning of the code, before the #include of <assert.h>, or define the macro on the command line if your compiler allows it (most use the -D switch). With NDEBUG defined, every assertion expands to a null macro, but the text remains in the code for documentation.

<limits.h>

By definition, portable programs do not depend in any way on the particulars of any one environment. Even assuming that all bytes consist of eight bits is not safe. The header <limits.h> defines the upper and lower bounds for all integer types (see Table 4). The program in Listing 1 toggles each bit in an integer on and off. It uses the value CHAR_BIT, defined in <limits.h>, as the number of bits in a byte, to determine the number of bits in an integer. As Listing 2 illustrates, you can also use <limits.h> to determine the most efficient data type to use for signed numeric values that must span a certain range.
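
The listings are not reproduced here, so the following is only a rough sketch of both ideas, not the code from Listing 1 or Listing 2. The names UINT_BITS, MAX_COUNT, counter_t, toggle_bit, and all_bits_on are mine, and the bit count assumes that an unsigned int has no padding bits, which holds on common platforms.

```c
#include <limits.h>
#include <stddef.h>

/* Bits in an unsigned int (assumes no padding bits in the representation). */
#define UINT_BITS (sizeof(unsigned) * CHAR_BIT)

/* Choose a signed type wide enough for a made-up range requirement. */
#define MAX_COUNT 100000L
#if INT_MAX >= MAX_COUNT
typedef int counter_t;      /* int is wide enough here */
#else
typedef long counter_t;     /* fall back to long, e.g. where int is 16 bits */
#endif

/* Flip the bit at position pos (0 is the least significant bit). */
unsigned toggle_bit(unsigned value, unsigned pos)
{
    return value ^ (1u << pos);
}

/* Turn on every bit, one at a time, starting from zero. */
unsigned all_bits_on(void)
{
    unsigned x = 0;
    size_t i;

    for (i = 0; i < UINT_BITS; ++i)
        x = toggle_bit(x, (unsigned) i);
    return x;
}
```

Toggling the same bit twice restores the original value, and under the no-padding assumption all_bits_on returns UINT_MAX.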

<stddef.h>

The header <stddef.h> defines three type synonyms and two macros (see Table 5). When you subtract two pointers that refer to elements of the same array (or to the position one past its end), you get the difference of the two corresponding subscripts, that is, the number of elements between the pointers. The result has a signed integer type, usually int or long, whichever is appropriate for your memory model. <stddef.h> defines the appropriate type as ptrdiff_t.
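
As a small illustration of pointer subtraction, the sketch below searches an array and returns the position of a match as a ptrdiff_t. The function index_of is a hypothetical helper of my own, and returning -1 for "not found" is just one possible convention.

```c
#include <stddef.h>

/* Hypothetical helper: subscript of the first element equal to key,
   or -1 if key does not occur in the first n elements of array. */
ptrdiff_t index_of(const int *array, size_t n, int key)
{
    const int *p;

    for (p = array; p < array + n; ++p)
        if (*p == key)
            return p - array;   /* difference of pointers = difference of subscripts */
    return -1;
}
```

Because ptrdiff_t is signed, it can carry the -1 sentinel, which an unsigned type such as size_t could not.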

The sizeof operator returns a value of type size_t. size_t is the unsigned integer type that can represent the size of the largest data object you can declare in your environment. Usually an unsigned int or unsigned long is sufficient to represent this size. size_t is usually the unsigned counterpart of the type used for ptrdiff_t. If you look through the headers in the Standard C Library, you'll find extensive use of type size_t. It is a good idea to use size_t for all array indices and for pointer arithmetic (i.e., adding an offset to a pointer), unless for some reason you need the ability to count down past zero, which unsigned integers can't do.
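
The counting-down caveat deserves a quick sketch. With a size_t index, a loop test such as i >= 0 is always true, so a backward scan has to be arranged differently. One common arrangement, shown here with a hypothetical function last_index_of, is to let the index run from n down to 1 and subtract one when subscripting.

```c
#include <stddef.h>

/* Hypothetical helper: subscript of the last element equal to key,
   or n if key does not occur in the first n elements of a. */
size_t last_index_of(const int *a, size_t n, int key)
{
    size_t i;

    /* i runs from n down to 1, so it never has to go below zero. */
    for (i = n; i > 0; --i)
        if (a[i - 1] == key)
            return i - 1;
    return n;                   /* n can never be a valid subscript */
}
```

Since no negative value is available, this sketch uses n itself as the "not found" result.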

The type wchar_t is an implementation-defined integral type that holds a wide character, for representing characters beyond standard ASCII. You define wide character constants with a preceding L, as in:

    #include <stddef.h>
    wchar_t c = L'a';
    wchar_t *s = L"abcde";
As Listing 3 illustrates, my environment defines a wide character as a two-byte integer. This coincides nicely with the emerging 16-bit Unicode standard for international characters (see the sidebar "Character Sets"). The <stdlib.h> functions listed in Table 6 use type wchar_t. Amendment 1, an official addendum to Standard C accepted in 1994, defines many additional functions for handling wide and multi-byte characters. For more detailed information, see P. J. Plauger's columns in the April 1993 and May 1993 issues of CUJ.
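
Here is a brief sketch of wide characters in use. It relies on mbstowcs, one of the standard <stdlib.h> conversion functions; the wrapper names wide_length and widen are my own. The size of wchar_t is implementation-defined, so the code never assumes two bytes.

```c
#include <stddef.h>
#include <stdlib.h>

/* Number of wide characters in s, not counting the terminating L'\0'. */
size_t wide_length(const wchar_t *s)
{
    size_t n = 0;

    while (s[n] != L'\0')
        ++n;
    return n;
}

/* Hypothetical wrapper: converts the multibyte string src into at most
   max wide characters in dest. Returns the number of characters stored
   (excluding any terminator), or (size_t)-1 if src contains an invalid
   multibyte sequence. dest is terminated only if the result is < max. */
size_t widen(wchar_t *dest, size_t max, const char *src)
{
    return mbstowcs(dest, src, max);
}
```

In the default "C" locale a plain ASCII string such as "abc" should convert one character at a time, so widen stores three wide characters followed by a terminator when the buffer has room.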

The NULL macro is the universal null-pointer constant, defined as one of 0, 0L, or (void *) 0. It is almost always a bad idea to assume any one of these definitions in a program. For safety, just include a header that defines NULL (stddef.h, stdio.h, stdlib.h, string.h, locale.h, time.h) and let the system figure out the correct representation. <stddef.h> is handy when you need only NULL defined in a translation unit and nothing else.

The offsetof macro returns the offset in bytes from the beginning of a structure to one of its members. Due to address alignment constraints, some implementations insert unused bytes after members in a structure, so you can't assume that the offset of a member is just the sum of the sizes of the members that precede it. For example, the program in Listing 4 exposes a one-byte gap in the Person structure after the name member, allowing the age member to start on a word boundary (a word is two bytes here). Use offsetof if you need an explicit pointer to a structure member:

struct Person p;
int *age_p;
age_p = (int *) ((char *) &p + offsetof(struct Person, age));
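
For a version you can compile, here is a sketch that wraps the same expression in a function. The layout of struct Person below is an assumption of mine, since Listing 4 is not shown here, and age_ptr is a hypothetical name. The size of any gap after name depends on your compiler's alignment rules.

```c
#include <stddef.h>

/* Assumed layout for illustration only. */
struct Person {
    char name[15];
    int age;
};

/* Returns a pointer to the age member, computed from its byte offset. */
int *age_ptr(struct Person *p)
{
    return (int *) ((char *) p + offsetof(struct Person, age));
}
```

The result is the same address as &p->age. The offset of age is at least 15, the size of name, and is larger wherever the compiler pads the structure for alignment.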

<time.h>

Most environments provide some mechanism for keeping time. <time.h> provides the type clock_t, a numeric type that tracks processor time (see Table 7). The clock function returns an implementation-defined value of type clock_t that represents the current processor time. Unfortunately, what is meant by "processor time" varies across platforms, so a single clock value isn't very useful. You can, however, subtract one processor time from another and divide the difference by the constant CLOCKS_PER_SEC, which yields the number of seconds elapsed between two points in time. The program in Listing 5, Listing 6, and Listing 7 uses clock to implement such stopwatch functions.
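
The idea fits in a few lines. The following is a minimal sketch of my own, not the code from the listings, and the names stopwatch_start and stopwatch_elapsed are hypothetical. It keeps a single starting time in a file-scope variable, so it supports only one stopwatch at a time.

```c
#include <time.h>

/* Processor time recorded by the most recent call to stopwatch_start. */
static clock_t start_ticks;

/* Remember the current processor time. */
void stopwatch_start(void)
{
    start_ticks = clock();
}

/* Seconds of processor time used since stopwatch_start was called. */
double stopwatch_elapsed(void)
{
    return (double) (clock() - start_ticks) / CLOCKS_PER_SEC;
}
```

Call stopwatch_start before the code you want to time and stopwatch_elapsed after it. Remember that this measures processor time as your platform defines it, which is not necessarily wall-clock time.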

The rest of the functions in <time.h> deal with calendar time. The time function returns a system-dependent encoding of the current date and time as type time_t (usually a long). The function localtime decodes a time_t into a struct tm (see Listing 8). The asctime function returns a text representation of a decoded time in a standard format, namely

   Mon Nov 28 14:59:03 1994
For more detail, see the Code Capsule "Time and Date Processing in C," CUJ, January 1993.
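
The three calendar functions chain together naturally. The sketch below wraps them in a hypothetical function time_text, a name of my own choosing, which copies the text into a caller-supplied buffer. Note that the string asctime returns ends with a newline character.

```c
#include <string.h>
#include <time.h>

/* Hypothetical helper: writes the local date and time for `when` into
   buf in asctime format. Returns 0 on success, or -1 if the time is
   unavailable or the buffer is too small. */
int time_text(time_t when, char *buf, size_t size)
{
    struct tm *local;
    const char *text;

    if (when == (time_t) -1)
        return -1;                      /* time() could not supply a value */

    local = localtime(&when);           /* decode into a struct tm */
    if (local == NULL)
        return -1;

    text = asctime(local);              /* e.g. "Mon Nov 28 14:59:03 1994\n" */
    if (strlen(text) >= size)
        return -1;

    strcpy(buf, text);
    return 0;
}
```

A typical call is time_text(time(NULL), buf, sizeof buf), with buf holding at least 26 characters for a four-digit year. Both localtime and asctime return pointers to static storage, which is why the sketch copies the text out before returning.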

I'll conclude this series on the Standard C Library next month with a discussion of the functionality of the headers in Group III.