October 1993/Code Capsules

Pointers, Part 3: The Rest of the Story

Chuck Allison

In the first two parts of this series, I explored the basics of pointers: indirection, pointer arithmetic, and the relationship between pointers and arrays. All the pointers I've used so far have referred to objects in user memory. In this capsule I'll discuss pointers to functions, how to use pointers to system memory in the IBM PC, and how to implement information hiding with pointers to incomplete types.

Pointers To Functions
A pointer can point not only at stored objects, but at functions as well. The following statement declares fp to be a pointer to a function that returns an int:

int (*fp)();
The parentheses around fp are necessary. Without them, the statement

int *fp();
declares fp as a function that returns a pointer to an int.
If you want to pass arguments, say, a float and a string, to the functions pointed at by fp, then write

int (*fp)(float, char *);
You can then store the address of such a function in fp:

extern int g(float, char *);
fp = g;
The name of a function in an expression resolves to an address. You can think of this address as pointing to the beginning of the code for that function (analogous to an array name pointing to its first element). The following "hello, world" program shows how to execute a function through a pointer:

/* hello2.c: Say hello via a function pointer */
#include <stdio.h>

int main(void)
{
    /* fp must have printf's own type: it returns int and is variadic */
    int (*fp)(const char *, ...) = printf;

    fp("hello, world\n");
    return 0;
}
To execute a function via a pointer, you would normally have to write

(*fp)("hello, world\n");
to "dereference" the pointer. In fact, this is the way you had to do it in pre-ANSI C, but the ANSI C committee decided to allow the normal function call syntax as shown in the preceding listing hello2.c. Since the compiler knows fp is a pointer to a function, it knows that the only thing you can do under the circumstances is invoke the function, so there is no ambiguity.
When you pass a function name as an argument to another function, the compiler actually passes a pointer to the function (just as it passes a pointer when you pass an array name). But when would you ever want to pass a function pointer to another function? One occasion is when you use qsort, the Standard C library sort function. qsort can sort an array of elements of any type, with either simple or compound sort keys. The program in Listing 1 shows how to use qsort to sort command-line argument strings. The program passes to qsort a pointer to function comp, which knows how to compare strings. (For a more in-depth treatment of qsort, see the Code Capsule in the April 1993 issue of CUJ.) When a function calls another function through a pointer, and the pointer is determined at run time, the call is known as a callback.
An array of function pointers can come in handy in menu-driven applications. Suppose you want to present the following menu to the user:
1) Retrieve
2) Insert
3) Update
4) Quit
The program in Listing 2 uses keyboard input as an index into an array of pointers. Each pointer in the array targets a function designed to process a menu choice. When the user makes a choice, the program calls the function through the indexed pointer.

PC System Memory — Video
By definition, using hard-coded addresses in a program makes it non-portable. The most successful programs in the PC market, however, do this profusely. In order to adapt to a system's configuration, for example, a program must know what type of video adapter you have. The program identifies the adapter by inspecting a certain machine address (0x463) on 100% IBM PC compatibles. Another technique commonly used in commercial PC applications is to write directly to video memory for lightning-fast screen display. In this section, I show how to address video memory directly, in "text" mode.
To understand how video memory works on the PC, you need to know something about memory models. Intel processors have a segmented memory architecture. When a program runs under the small memory model, it has a single 64K data segment, and therefore uses 16-bit pointers. These are called near pointers, and they are merely offsets from the beginning of the user data area. In the large memory model, all conventional memory is available, but you refer to it with two 16-bit integers: a segment, and an offset within that segment. In real mode (emulating the original 8086 or 8088), a segment can start at any address divisible by 16 (called a "paragraph" boundary). Therefore, there are 65,536 overlapping 64K segments addressable in a 1-megabyte address space. To find the actual byte address that corresponds to a segment/offset pair, shift the segment left 4 bits (or multiply by 16) to find its true address, and then add in the offset. A large model address, therefore, requires only 20 bits and fits nicely into a long, with room to spare. A pointer that holds a segment/offset pair is called a far pointer. For every far address, there are many combinations of segments and offsets that resolve to that address (4096, to be exact). For example, you can represent the byte address 0x463 by the following sample segment/offset pairs (numbers are in hexadecimal):

0000:0463
0040:0063
0046:0003    /* Normalized address (offset < 16) */
Most of the time you don't need to worry about this stuff; you can just dereference the pointers and everything will be okay.
Video memory begins at segment number 0xb000 for monochrome systems and 0xb800 for color adapters. To determine which type of system you have, query the word at address 0000:0463, as I have done in the following:

#include <dos.h>    /* I'm using Borland C here */

static char far *base;
. . .
/* Find out where video memory is */
unsigned far *vptr = MK_FP(0,0x463);
if (*vptr == 0x3b4)
    base = MK_FP(0xb000,0);    /* Monochrome */
else
    base = MK_FP(0xb800,0);    /* Color */
. . .
The far keyword is not necessary in large model, but a program that uses it will work in both models. The macro MK_FP(segment,offset) combines a segment and an offset into a far pointer, a 32-bit value that holds the segment in its upper 16 bits and the offset in its lower 16 bits.
Another feature of video memory to keep in mind is that it is interleaved; that is, each display character is followed immediately by its attribute, an 8-bit code that determines color, boldness, blink state, etc.:

The numeric values are byte offsets within either segment b800 or b000. The address of each character on the screen, therefore, maps to the standard 25-line color display as follows:

b800:0000  b800:0002  ...  b800:009e
    .          .               .
    .          .               .
    .          .               .
b800:0f00  b800:0f02  ...  b800:0f9e
When we consider each character position to be specified by row and column:

(0,0)   (0,1)   ...  (0,79)
  .       .             .
  .       .             .
  .       .             .
(24,0)  (24,1)  ...  (24,79)
then a very simple mapping exists between a screen position and its corresponding video address:

address == base + 160*row + 2*col
where base points to the appropriate segment. The following function places a character with its attribute in video memory so that it displays at the appropriate row and column on the screen:

void vputc(char c, int row, int col, char attr)
{
    char far *p = base + 160*row + 2*col;

    *p++ = c;       /* character byte first */
    *p = attr;      /* then its attribute */
}
To retrieve the character at a certain (row, column) position is just as simple:

char far *p = base + 160*row + 2*col;
char c = *p;
It is now an easy matter to implement pop-up windows. The technique is essentially:
1. Save the text and attributes of the area to be overlaid
2. Blank-fill the area (if necessary)
3. Display the new window information
4. Process user commands
5. Restore the original area
Listing 3 shows the functions that do the job. The value 0x0720 in function vclear_area represents the "normal" attribute (0x07) and the space character (0x20). Remember that the PC has a "little-endian" architecture, so the bytes of 0x0720 get swapped in memory, which is just what we want (i.e., character first (0x20), then attribute (0x07)). Macros for processing video attributes are shown in the file video.h in Listing 4.

Exercise
Write a function print_screen that behaves like the PrintScreen key on the PC: it sends to the printer an image of the text on the screen. See Listing 5 for a possible solution (after you've worked it out yourself, of course).

PC System Memory — Volatile Storage
The PC has a system clock that "ticks" by incrementing the unsigned long integer at location 0000:046c every 55 milliseconds (approximately 18.2 times per second). The time functions in the Standard C library use this tick count to compute the current time, but you can use it as a stopwatch also. You already know how to retrieve the value at that address:

unsigned long far *tptr = MK_FP(0,0x46c);
unsigned long tval = *tptr;
You can retrieve this value at two different times in your program and compute the elapsed time between them. Subtract the earlier tick count from the later one to compute how many clock ticks have elapsed. To find the number of seconds, divide by 18.2. You can implement a count-down timer like this:

/* Pause for n clock ticks: */
unsigned long far *tptr = MK_FP(0,0x46c);
unsigned long start = *tptr;
unsigned long diff;

do
    diff = *tptr - start;
while (diff < n);
There is a minor problem here, however. Most of today's optimizing compilers are "smart" enough to infer that the value of diff doesn't change — you're calculating the same value over and over. So why bother to do all that unnecessary work? Such compilers will generate code equivalent to

diff = *tptr - start;
while (diff < n)
    ;
A really good compiler might even notice that this is an infinite loop and give you a warning. The reason for this mess, of course, is that the compiler doesn't know that the value tptr references can change due to forces outside of your program. If you tell the compiler that the storage for the clock tick is volatile, it won't try any such optimizations. Replace the first line above with:

volatile unsigned long far *tptr = MK_FP(0,0x46c);
See Listing 6 and Listing 7 for an implementation of some useful stopwatch functions. The program in Listing 8 uses some of these functions to print each clock tick on the screen as it occurs until you abort from the keyboard.

Encapsulation
Good programming practice calls for you to hide from the user (in this case, a programmer) any implementation details that he doesn't need to know. This practice is called encapsulation. For example, to implement a stack of integers, I might provide the user with this include file:

/* stack1.h */
#define STACK_ERROR (-32768)

void stack_init(void);
int stack_push(int);
int stack_pop(void);
The user has no idea how I implemented the stack, nor has he direct access to the stack itself. Listing 9 has an array implementation of a stack. I declared the array and its stack pointer index static to hide them from the user.
This solution supports only a single stack. If I wanted to allow multiple stacks, I might give the user this include file:
     /* stack2.h */

     #define STACK_ERROR (-32768)

     int stack_open(void);
     int stack_push(int,int);
     int stack_pop(int);
     void stack_close(int);
The function stack_open returns a handle to a stack, which you use when calling the other functions. A suitable array-based implementation is in Listing 10. You should immediately notice some disadvantages:

There is a limit to the number of stacks.

The size of each stack is fixed.

Unused stacks will waste memory space.
There has to be a better way. It would be nice if you could just define a new stack data type, such that whenever the user needed one, he could just declare one:
    stack s1;
This, of course, is what C++ is all about. You can simulate this in C with typedefs and pointers to structures:
     typedef struct stack
     {
        size_t size;
        int *data;
        int ptr;
     } Stack;
With this approach, you can ask for as many stacks as you want, each of any size, limited only by available memory:
    Stack *s1 = stack_create(10);
    Stack *s2 = stack_create(100);
The function stack_create dynamically allocates a stack object from the heap and returns a pointer to it. You use that pointer as a handle to the stack in other stack processing functions. Listing 11 shows the interface, and Listing 12 shows the implementation.
There is still some danger lurking. The user might be tempted to directly access the members of a stack structure. Since you gave him the structure definition, and since he has pointers to play with, nothing can stop him from doing something like
    Stack *s1 = stack_create(10);
    s1->size = 20; /* Bad move! */
You can't prevent a malicious user from misusing a pointer handle. What you can do is remove the temptation. How? By hiding the structure definition itself.

Incomplete Types
An incomplete type is a type whose size the compiler cannot determine because part of its description is missing. One example is in the declaration

extern int a[];
This statement says that a is an array of unknown size, so don't apply sizeof to it; you'll get an error message. This declaration is quite different from

extern int *a;
where sizeof(a) is the size of a pointer. You can hide the very layout of a stack structure from the user by declaring it an incomplete type. Notice that the structure layout in the include file in Listing 13 is missing. The statement
    struct stack;
says that there is a structure with the tag name stack, but nothing else; it is an incomplete type. The compiler won't complain as long as you use only pointers to such a structure. Now you can move the structure definition into the implementation file, where the user doesn't even see it (see Listing 14). Since he doesn't know what the data members are, he is less likely to ever dereference a pointer to a stack object. By the way, you can combine the incomplete type declaration and typedef into a single statement:
    typedef struct stack Stack;
The compiler will infer that struct stack is an incomplete type.

Conclusion
I began this three-part series by stating that in theory pointers were easy to understand. (I suppose I said that so you would keep reading.) I then proceeded to discuss

indirection

pointer arithmetic

pass-by-reference semantics

generic pointers (pointers to void)

the const qualifier

pointers and arrays

pointers to functions

the volatile qualifier

encapsulation via pointers to incomplete types
So maybe I lied. (You know what they say: "In theory, theory and practice are the same, but in practice they're not.") Then again, I chose my words carefully. I said that mastering pointers comes naturally from following a few basic principles and techniques. If you consider the number of items in the list above to be "few," then I told the truth. In any case, if you now understand these concepts, and if you'll apply them only when you need them (raise your right hand and repeat after me...), then I hereby pronounce you a responsible C programmer. Tell your boss she can trust you to write real programs.