Chuck Allison is a regular columnist with CUJ and a software architect for the Family History Department of the Church of Jesus Christ of Latter Day Saints Church Headquarters in Salt Lake City. He has a B.S. and M.S. in mathematics, has been programming since 1975, and has been teaching and developing in C since 1984. His current interest is object-oriented technology and education. He is a member of X3J16, the ANSI C++ Standards Committee. Chuck can be reached on the Internet at allison@decus.org, or at (801)240-4510.## Bitwise Operators

There are six bitwise operators in C (see Table 1) . As the program in Listing 1 illustrates, the bitwise-OR operator sets a bit in the result if either one of the corresponding operand bits is "set" (has a value of 1). For example:

01010101 (== 0x55) | 11110000 (== 0xf0) ------------- = 11110101 (== 0xf5)The bitwise-AND operator sets only those bits whose corresponding operand bits are both set:

01010101 & 11110000 ------------- = 01010000 (== 0x50)The bitwise-EXCLUSIVE-OR operator sets those bits when one or the other (but not both) of the original operand bits is set:

01010101 ^ 11110000 ------------- = 10100000 (== 0xa0)The bitwise-NOT operator "flips" each bit by making each 1 a 0, and vice-versa. The cast tounsigned charin Listing 1 is necessary because Standard-conforming compilers always promotecharstointsbefore applying bitwise operations. Without the cast the output would be:

~a == FFAAsinceprintfnever truncates significant values in numbers, andintson my machine are two bytes wide.The shift operators shift all bits left or right by the number of positions specified in the right operand. Bits that "fall off the end" are lost. The left-shift operator sets vacated bits to zero. When shifting signed integers to the right, some machines zero-fill while others replicate the sign bit. It is common practice, therefore, to declare integers unsigned, which will always be zero-filled when you shift them to the right.

## Accessing Individual Bits

Quite often you need to set, reset, or test individual bits. If not too many bits are involved, you might consider using bitfield structures. For example, suppose you are only interested in bits 4 and 5 of an integer. Logically, bits are numbered right-to-left, starting from 0. Each bit number represents a power of 2, signifying the weight of each digit in a number's binary representation. For example, bit 0 represents 20, bit 1 represents 21, bit 2 represents 22, and so on. On a little-endian machine, the following structure would typically be suitable:

struct bits4_5 { unsigned : 4; /* Skip bits 0-> 3 */ unsigned bit4 : 1; unsigned bit5 : 1; };A big-endian architecture typically calls for

struct bits4_5 { unsigned : 10; /* Skip bits 15 -> 6 */ unsigned bit5 : 1; unsigned bit4 : 1; };You can then process the bits via the named variables:

unsigned x; struct bits4_5 *xp = (struct bits4_5 *) &x; xp->bit4 = 0; xp->bit5 = 1; f(xp); /* May modify the bits */ if (xp->bi t4) . . .This technique is somewhat unattractive not only because it isn't very portable, but also because it is cumbersome with a large number of bits. The preferred method to manipulate arbitrary, individual bits is via the bitwise operators. For example, to set theith bit of an integer, form a mask with a 1 in theith bit position and zeroes elsewhere (I'll call it the "1-mask"), then bitwise-OR it into the number:

x | = (1u << i)To turn a bit off, form the complement of the above mask (the "0-mask") and bitwise-AND it in:

x &= ~(1u << i)The program in Listing 2 sets all the bits of an integer in turn, right-to-left (logically), and resets them in the same order. The quantityCHAR_BIT,fromlimits.h,is the number of bits in a byte.To toggle a bit, exclusive-OR the 1-mask into the number:

x ^= (1u << i)To see whether or not a bit is set, bitwise-AND the integer with the same mask:

x & (1u << i)This expression evaluates to0if the bit is not set, or non-zero (2 to theith power, in particular) if it is. The following useful idiom yields a0or1only:

!!(x & (1u << i))in case you want to use the result as an index into an array.The header file in Listing 3 defines macros for non-destructive versions of these operations, and declares some useful functions. The function implementations in Listing 4 use the

nbitsmacro to compute the size of anunsigned intin bits. The functionfputbprints an unsigned integer in binary form with bit-0 rightmost and with leading zeroes if necessary.fgetbreads a string of bit characters and converts it into anunsigned. The curious statement

sprintf(format," %%%d [01]", nb);builds a scan format string that skips whitespace and then reads up tonb1s and 0s, or reads until it scans a non-bit character. You might think that you could avoid thesprintfby using a variable-width format descriptor in the scan, such as:

fscanf(f," %*[01] ",nb,buf);Unfortunately, variable-width substitution doesn't work with scansets. (If this is Greek to you, see October 1992's Code Capsule, "Text Processing I".)The

countfunction computes the number of 1-bits in a number by right-shifting one bit at a time and testing bit 0. The test program in Listing 5 illustrates these functions.

## Large Bitsets

What if you need more bits than an integer holds? For example, suppose that you must create a "picklist" user interface object. A picklist is a scrollable, pop-up listbox that allows the user to choose multiple entries. The number of entries can easily exceed the number of bits in a long integer. The most efficient way to track the state of each entry is to associate a bitset with the picklist — if the user highlights theith entry, then you set theith bit.You can of course use an array of

unsignedsto hold the bits you need:

If you need

nbits, then you needN = ceil (n / BLKSIZ)array elements, whereBLKSIZis the number of bits in anunsigned(or whatever integral type you derive the array from). Ifnis not a multiple ofBLKSIZ,then there will be unused bits in the zeroth block (represented by XX in the preceding figure). Now, accessing a particular bit reduces to finding the particular block it resides in, and computing the offset for the mask to apply to that block. To find the right block, notice the following pattern:

Ifbits in this range are in this block b/BLKSIZ---------------------------------------------------[0 , BLKSIZ-1] N-1 0 [BLKSIZ , 2*BLKSIZ-1] N-2 1 . . . . . . . . . [(N-1)*BLKSIZ , n-1] 0 N-1bis the bit number in question, then

B + b/BLKSIZ == N-1whereBis the block we are after. Therefore,

B = N-1 - b/BLKSIZThe offset of bitbwithin blockBis simply

offset = b % BLKSIZIfbitsis the array name, then you can process theBth bit with these expressions:

bits[B] |= (1u << offset); /* set */ bits[B] &= ~(1u << offset); /* reset */ bits[B] ^= (1u << offset); /* toggle */ !!(bits[B] & (1u << offset)); /* test */The header file in Listing 6 presents an interface for processing arbitrarily large bitsets. In addition to the functions already discussed, it adds conversions to and fromunsigned, in case you are working only with word-sized bitsets and want to interface with your host environment. The statement

typedef struct bits Bits;declares structbitsas an incomplete type with synonymBits. The definition of the structure is in the implementation file. (See Listing 7 — If incomplete types are Greek to you, see last month's installment.) If you want to use another integral type for the base array, just substitute it forunsigned intin line 11 ofbits. c.A

Bitsobject consists of the number of bits allowed, the array of integers, the number of elements in the array, and a mask to reset the unused bits to zero that have inadvertently been set by operations on other bits. The test program in Listing 8 shows how you might use aBitsobject in an application.

## Bit Strings

The storage scheme in aBitsobject puts bit 0 last in the array, so bits are numbered right-to-left, just like they are in single integers. In situations where there is no numerical application of bitsets, like the picklist example above, it might seem more natural to number bits from left to right, like a string. It is trivial to modifybits.cto make this change in orientation. The orientation is represented as:

The block for the

bth bit is now

B = b / BLKSIZand the offset is calculated from the left instead of the right of the block:

offset = BLKSIZ - b%BLKSIZ - 1Listing 9 and Listing 10 summarize the changes required to transform the header file and implementation respectively from a bitset into a bit string. The test program is in Listing 11.

## Stay Tuned

There are a number of features we could and should add to these objects. It would be convenient to provide bitwise operations that apply to entire objects, for example

Bits *bp1, *bp2, *bp3; /* . . . */ bits_and(bp3,bp1,bp2); bits_shift_left(bp3,4); /* . . . */It would also seem natural to allow bit strings to grow and shrink, just like strings. Before we get too carried away, however, it's time to consider moving to C++. In C++, we can overload operators so that the above excerpt looks like:

Bits bp1, bp2; // . . . Bits bp3 = bp1 & bp2; bp3 <<= 4;The next two installments will cover the two bitset classes that were voted into the standard C++ library in March, 1993 in Portland.