Code Capsules

Variable-Length Argument Lists

Chuck Allison


If printf is not the most widely used function in the standard C library, it is certainly the most flexible. printf, and its companions sprintf and fprintf, are the workhorses of formatted output. Their ability to process a variable number of arguments makes these functions very useful indeed. The format string in the statement

printf("%d, %s\n",n,s);
tells printf that it must extract an int and then a pointer to char from the argument space (usually the program stack). But how does printf get at the optional arguments? Well, I can't show you the inner workings of printf, but I can show you the mechanism C provides to handle variable-length argument lists. In this article I will show how you can write functions of your own that accept a variable number of arguments, and why you might want to do so.

The Ellipsis Prototype Specification

printf's function prototype in your compiler's stdio.h should look something like

int printf(const char *, ...);
The ellipsis tells the compiler to allow zero or more arguments of any type to follow the first argument in a call to printf. The format string communicates the number and type of the caller's optional arguments to the printf function. For printf to behave correctly, the arguments that follow the format string must match the types of the corresponding conversion specifications in the format string. If the argument list contains fewer arguments than the format string expects, the results are undefined. If the argument list contains more arguments than expected, printf ignores the extras. The bottom line is that when you use the ellipsis in a function prototype you are telling the compiler not to type-check your optional arguments because you think you know what you're doing, so be sure that you do.

Variable-Length Argument Lists from Scratch

It's not too difficult to write your own functions that will accept a variable number of arguments. The program in Listing 1 extracts the maximum of a list of integers. It uses a fixed integer argument to communicate the number of elements in the list. The program assumes that the list begins immediately after the integer n in memory, that is, at address &n + 1.

The program in Listing 2 extends this technique to lists of mixed types. Since the pointer p must visit arguments of different size, I have defined it to be a pointer to char. To extract an object of a certain type from the argument space, I just apply the appropriate cast and dereference, as in

s = *(char **) p;
and I use pointer arithmetic to skip over the extracted object:

p += sizeof s;
The following macros, which automate this process, appear in Listing 3:

#define first_arg(x,p)  p = (char *) &x + sizeof(x)
#define next_arg(p,T,x) x = *(T*)p; p += sizeof(T)
first_arg initializes p with the address of the first argument that follows the fixed argument x. next_arg assigns an object of type T to x and then skips past it.

The va_list Mechanism

The programs in Listing 1 through Listing 3 are okay for illustration purposes, but they aren't portable; they work only on platforms that store arguments linearly in order of increasing address and that leave no holes between arguments. Fortunately, Standard C defines a portable method for processing variable-length argument lists, with the macros defined in stdarg.h (see Listing 4). The stdarg header defines a new type, va_list ("variable argument list"), which refers to the list of optional, trailing arguments in a function call. The statement

va_start(args,npairs);
initializes args to refer to the optional arguments that follow the fixed argument npairs. To extract an object from the list and "advance" to the next, call va_arg, specifying the desired type:

n = va_arg(args,int);
To be completely portable, you must close a va_list with the va_end macro (although on my compiler it is just a no-op). Listing 5 shows how to code a portable version of the maxn function from Listing 1.
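As a rough sketch of the idea (not the published Listing 5), a portable maxn written with the stdarg macros might look like the following. The assert and the requirement that n be at least 1 are assumptions of this sketch.

```c
#include <assert.h>
#include <stdarg.h>

/* Return the largest of the n int arguments that follow n.
   Assumes n >= 1 and that exactly n ints are actually passed. */
int maxn(int n, ...)
{
    va_list args;
    int i, max;

    assert(n >= 1);
    va_start(args, n);              /* n is the last fixed argument */
    max = va_arg(args, int);        /* first optional argument */
    for (i = 1; i < n; ++i) {
        int x = va_arg(args, int);  /* fetch the next int and advance */
        if (x > max)
            max = x;
    }
    va_end(args);                   /* close the list for portability */
    return max;
}
```

A call such as maxn(3, 1, 3, 2) yields 3. Nothing checks that the count matches the arguments actually supplied; that remains the caller's responsibility.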

As you can see, to use variable-length argument lists you must provide two things:

1) At least one fixed argument (always the last before the ellipsis) to initialize the va_list, and

2) Some mechanism that communicates the number and/or type of arguments to the function.

The following function prototype is both useless and syntactically invalid:

void f(...);
/* Location of args unknown */
There are a number of ways to satisfy point 2). For example, the program in Listing 6 concatenates a variable number of strings into its fixed string argument. The program processes one string after another until it finds a null pointer in the va_list. A call such as

concat(s,NULL);
initializes s to the empty string.
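A minimal sketch of such a sentinel-terminated function, assuming the caller supplies a destination buffer large enough for the result (this is an illustration, not the published Listing 6):

```c
#include <assert.h>
#include <stdarg.h>
#include <string.h>

/* Build in dest the concatenation of the string arguments that follow,
   stopping at the first null pointer. No bounds checking is done;
   the caller must make dest large enough. */
char *concat(char *dest, ...)
{
    va_list args;
    char *s;

    assert(dest != NULL);
    dest[0] = '\0';                 /* start with the empty string */
    va_start(args, dest);
    while ((s = va_arg(args, char *)) != NULL)
        strcat(dest, s);
    va_end(args);
    return dest;
}
```

Because the compiler does not convert optional arguments, it is safer to write the terminator as (char *)NULL; on an implementation that defines NULL as a plain 0, an uncast NULL would be passed as an int rather than as a pointer.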

As another example, consider a certain screen interface library that supports data entry tables. The library has a function table_put_row that allows you to fill rows with initial data. For example, if you have defined the columns of a table to represent name, occupation and salary fields, you can populate the table like this:

table_put_row(tp,0,"Sandra", "Executive","57000");
table_put_row(tp,1,"James", "Mechanic","45000");
table_put_row(tp,2,"Kimberly", "Musician","66000");
/* etc. */
where tp is a pointer to a table structure and the second argument is the row number. As you can see in Listing 7, table_put_row doesn't need a parameter specifying the number of field arguments since it can infer that number directly from the Table structure.
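The idea can be sketched as follows. The Table layout, the size limits, and the ncols member are all hypothetical, invented here for illustration; the real library's structure is not shown in this article.

```c
#include <assert.h>
#include <stdarg.h>
#include <string.h>

#define MAX_ROWS  16    /* hypothetical limits for this sketch */
#define MAX_COLS   8
#define FIELD_LEN 32

typedef struct {
    int ncols;          /* fixed when the table's columns are defined */
    char cell[MAX_ROWS][MAX_COLS][FIELD_LEN];
} Table;

/* Store one string per column in the given row. The number of optional
   arguments comes from tp->ncols rather than from a count parameter. */
void table_put_row(Table *tp, int row, ...)
{
    va_list args;
    int col;

    assert(tp != NULL && row >= 0 && row < MAX_ROWS);
    assert(tp->ncols >= 0 && tp->ncols <= MAX_COLS);
    va_start(args, row);
    for (col = 0; col < tp->ncols; ++col) {
        char *field = va_arg(args, char *);
        strncpy(tp->cell[row][col], field, FIELD_LEN - 1);
        tp->cell[row][col][FIELD_LEN - 1] = '\0';   /* always terminate */
    }
    va_end(args);
}
```

The caller must still pass exactly as many strings as the table has columns; the structure only tells the function how many to expect.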

va_lists as Arguments

Listing 8 presents a useful function, fatal, which prints a formatted message to stderr and exits a program gracefully. You call it as you would printf, with a format string and a list of parameters, such as

fatal("Error %d on device %d\n", err,dev);
What you would like to do is just pass the format string and print arguments to some function that implements the printf machinery. The C library function vfprintf makes this very easy. All you have to do is initialize a va_list with the print arguments and pass it as the third argument. As you would expect, the C library includes the companion functions vprintf and vsprintf as well.
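Here is one way the technique might look, as a sketch rather than the published Listing 8. The report function is a hypothetical helper added only to show the same pattern with an arbitrary stream; the exit status EXIT_FAILURE is likewise an assumption.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: printf-style output to any stream.
   Returns whatever vfprintf returns. */
int report(FILE *fp, const char *fmt, ...)
{
    va_list args;
    int n;

    assert(fp != NULL && fmt != NULL);
    va_start(args, fmt);
    n = vfprintf(fp, fmt, args);    /* the va_list is the third argument */
    va_end(args);
    return n;
}

/* Print a formatted message to stderr, then terminate the program. */
void fatal(const char *fmt, ...)
{
    va_list args;

    va_start(args, fmt);
    vfprintf(stderr, fmt, args);
    va_end(args);
    exit(EXIT_FAILURE);
}
```

Both functions only forward their optional arguments; all interpretation of the format string is left to vfprintf.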

Why the Fuss?

Programmers coming from almost any other language may wonder why we need all of this machinery. For example, FORTRAN programmers are quite accustomed to writing print statements that give no explicit information about the number or type of the arguments in their argument lists:

*     Output two numbers:
     PRINT *, x, y
In a FORTRAN statement, to find the maximum of a list of numbers, only the numbers appear:

PRINT *, MAX(1,3,2)
The reason programmers can do this in FORTRAN is that the PRINT statement and functions such as MAX are part of the language (FORTRAN calls the latter intrinsic functions). The compiler knows their requirements and therefore can supply the appropriate information. In C, on the other hand, there is no input, output, or any other functionality built into the language except for what the operators provide. The C philosophy is to keep the language small and to supply needed functionality with libraries. Since the only communication between libraries and the compiler is the function call mechanism, you must give a function all the information it needs when you call it.

An Application

In financial and other numerical applications you often want to express integers, such as monetary amounts, as groups of digits separated by commas:

$11,235,852
One approach is to convert the number to a string with sprintf and then traverse the string backwards, copying it to another string and inserting commas as needed. Another approach, which I present here, solves the more general problem of creating strings backwards.

The program in Listing 9 calls a function prepend to build a string backwards. You pass prepend three arguments: the output buffer, an offset which points to the first character in the populated portion of the string, and the string to prepend. After it's finished, prepend returns the new offset.
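A function with this interface might be sketched as follows. Returning -1 when the new text does not fit is an assumption of this sketch; the published listing may handle that case differently.

```c
#include <assert.h>
#include <string.h>

/* Copy new_str into buf so that it ends just before buf[offset].
   Returns the new offset, or -1 if new_str does not fit (an
   assumption of this sketch). */
int prepend(char *buf, int offset, const char *new_str)
{
    size_t len;

    assert(buf != NULL && new_str != NULL && offset >= 0);
    len = strlen(new_str);
    if (len > (size_t)offset)
        return -1;
    offset -= (int)len;
    memcpy(buf + offset, new_str, len);   /* terminator is not copied */
    return offset;
}
```

The caller starts with the offset at the last element of the buffer, stores a '\0' there, and then works toward the front; buf + offset is always a valid string.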

The following diagrams show the state of s[] after each call to prepend:

Listing 10 presents the implementation of prepend along with a function preprintf, which allows you to prepend strings with formatting. preprintf uses vsprintf to create the formatted string, and then calls prepend to tack it onto the front of the existing string.
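The following sketch shows the shape of such a pair. It is not the published Listing 10: it uses vsnprintf (available since C99) in place of vsprintf so that the scratch buffer cannot overflow, and the scratch size and the -1 error return are assumptions. A small prepend is included so that the fragment compiles on its own.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Copy new_str so that it ends just before buf[offset]; return the new
   offset, or -1 if it does not fit (an assumption of this sketch). */
static int prepend(char *buf, int offset, const char *new_str)
{
    size_t len = strlen(new_str);

    if (offset < 0 || len > (size_t)offset)
        return -1;
    offset -= (int)len;
    memcpy(buf + offset, new_str, len);
    return offset;
}

/* Format the optional arguments, then prepend the result to buf. */
int preprintf(char *buf, int offset, const char *fmt, ...)
{
    char temp[256];     /* arbitrary scratch size for this sketch */
    va_list args;
    int n;

    assert(buf != NULL && fmt != NULL);
    va_start(args, fmt);
    n = vsnprintf(temp, sizeof temp, fmt, args);
    va_end(args);
    if (n < 0 || (size_t)n >= sizeof temp)
        return -1;      /* formatting failed or was truncated */
    return prepend(buf, offset, temp);
}
```

As with fatal, the variadic function does no formatting of its own; it simply hands its va_list to the library.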

I can now implement a function, commas, in terms of prepend and preprintf (see Listing 11). As I extract each digit in turn, moving right to left by the usual remainder and quotient calculations, I push that digit onto a static character buffer, inserting commas where necessary. commas returns a pointer to the beginning of the completed string, which may or may not coincide with the beginning of the buffer. Note that the numeric base and the grouping size are parameterized.
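The digit-by-digit algorithm can be sketched on its own as shown below. To stay self-contained, this version stores characters directly into the buffer instead of calling prepend and preprintf, so it illustrates the right-to-left technique rather than reproducing Listing 11. The parameter list, including the separator argument, is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Format n in the given base, inserting sep between groups of `group`
   digits. Returns a pointer into a static buffer that is overwritten
   by the next call. */
char *commas(unsigned long n, int base, int group, char sep)
{
    static const char digits[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    static char buf[128];   /* room for 64 binary digits plus separators */
    size_t pos = sizeof buf - 1;
    int count = 0;          /* digits written so far */

    assert(base >= 2 && base <= 36 && group >= 1);
    buf[pos] = '\0';
    do {
        if (count > 0 && count % group == 0)
            buf[--pos] = sep;               /* group boundary */
        buf[--pos] = digits[n % (unsigned long)base];
        n /= (unsigned long)base;
        ++count;
    } while (n != 0);
    return buf + pos;       /* usually not the start of buf */
}
```

With these parameters, commas(11235852UL, 10, 3, ',') produces the string 11,235,852, and commas(4095UL, 16, 2, ' ') produces F FF.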

Conclusion

In this article I have illustrated the hows and whys of variable-length argument lists. The stdarg macros are a lot like a parachute — you don't need one very often, but when you do, usually nothing else will suffice. Since these macros involve a relaxation of C's argument type-checking mechanism, be sure to use them with care.