Code Capsules

Bit Handling in C++, Part 1

Chuck Allison


Chuck Allison is a regular columnist with CUJ and a software architect for the Family History Department of the Church of Jesus Christ of Latter Day Saints Church Headquarters in Salt Lake City. He has a B.S. and M.S. in mathematics, has been programming since 1975, and has been teaching and developing in C since 1984. His current interest is object-oriented technology and education. He is a member of X3J16, the ANSI C++ Standards Committee. Chuck can be reached on the Internet at allison@decus.org, or at (801)240-4510.

The Standard C++ Library

When ANSI Committee X3J16 convened in 1989, it planned to at least standardize the C++ library stream classes. At the close of the third meeting in July 1990 the Library Working Group emerged with more specific direction: 1) standardize the stream classes, 2) define mechanisms to support language features such as the new and delete operators and exceptions, 3) define the relationship between the standard C library and the C++ library, and 4) define a standard string class. Working group members sometimes call this list the "Blood in the Streets Manifesto" to indicate what might happen if we fail. I am happy to report that after three years of difficult work, our blood seems destined to stay where it is. We have completed the above four items, modulo some fine tuning.

The overall charter of any standards committee is to standardize existing practice. It is not feasible, however, for the committee to design a comprehensive object-oriented library. A large group of volunteers cannot accomplish what a small, focused department can (plus the pay is not as good). It would be more reasonable to standardize a small number of popular, low-level classes as building blocks for a wide spectrum of applications. To determine the existing industry practice in development of libraries, in June 1991 I volunteered to review the libraries which a number of vendors had submitted to the committee. All submissions offered at least one string class and a number of container classes, such as lists, vectors, queues, stacks, and sets. Many vendors' libraries also supported bit sets. Now, two years later, the following classes are part of the standard C++ library (in addition to streams and language support):

Class       Description
string      dynamic array of characters
wstring     dynamic array of wide characters
dynarray    dynamic array of anything (vector)
bits        fixed-length bitset
bitstring   dynamic-length bit string
The string class, which started with contributions from Aron Insinga of Digital, and was completed through the efforts of Uwe Steinmueller of Siemens Nixdorf and Pete Becker at Borland, supports most of C's string.h functionality on dynamically-sized arrays of bytes. Beman Dawes adapted the string class to handle arrays of wide characters, of type wchar_t. (FYI: this is a keyword representing a distinct, overloadable type in C++). Uwe also designed the dynarray class. I was interested in improving on C's awkward bitwise support, and designed the last two classes. In this article I will discuss the bitstring class.

Bit Sets

C's bitwise operators allow you to process an integer as an ordered set of bits. This ability is useful for system interface operations, which often interpret individual bits as flags. If you want more bits than can fit in an unsigned long, however, you're out of luck. An extended version of such a bit array can come in handy, though. For example, I have developed a user interface library which provides picklists, scrollable popup menus that allow multiple choices. For efficiency, I like to track the choices with only one bit per entry. Bit arrays are also handy with sets. You can simulate a set of integers with a bit array: if the bit in position n is set, then the integer n is a member of the set. Pascal and Modula-2 both implement sets this way. (See last month's Code Capsule for a C implementation of bitsets.)

The first bitset class proposal submitted to X3J16 tried to meet all of these needs with a single class. After much discussion and little agreement among working members, the committee realized that existing practice called for two distinct abstractions: fixed-length bit sets (for a collection of flags, say) and bit strings (i.e., dynamic bit arrays). One year and three committee meetings later, these abstractions became class bits and bitstring, respectively.

Class bitstring

The bitstring class provides much of the same functionality as the string class (see the sidebar "A Simple String Class"). bitstring supports dynamically-sized bit strings with operations such as insert, remove, replace, and concatenate, as well as the basic bit operations — set, reset, test, and toggle (see Listing 4) . Since a bitstring is in fact a string, the bit numbering begins with bit 0 on the left, and increases as you move to the right (this is the reverse of numeric bitfields, which are numbered from right to left). I can now easily implement a picklist by embedding a bitstring:

class Picklist : public Popmenu
{
    bitstring picks;
    // . . .
public:
    Picklist(.., nitems,..)
        : picks(0,nitems), .. {}
    // . . .
};
When the user highlights the nth entry, I set bit n in the bitstring:

picks.set(n);

Member Function Descriptions

This section is a modified excerpt from the official proposal which describes the semantics of each member function. Since the library group is still deciding how to integrate exceptions into the standard library, I just mention them briefly here. The names and uses of exceptions are subject to change. I use assertions in place of exceptions in the implementation (see Listing 5 and Listing 6) . Throughout the rest of this article, the value NPOS, which stands for "not a position," is the largest possible value of type size_t, which is equivalent to (size_t)(-1). A bitstring can hold up to NPOS - 1 bits.

1.0 Constructors

Synopsis

bitstring()
bitstring(unsigned long n, size_t nbits)
bitstring(const string& s)
bitstring(const bitstring& b)

1.1 Constructor bitstring()

Description

Creates a bit string of length zero.

1.2 Constructor bitstring(unsigned long n, size_t nbits)

Description

Creates a bit string at least nbits long, initialized with the contents of n. The constructor drops no significant bits, i.e., this->length == max(nbits, N+1), where N is the bit position of the highest-order one bit. The constructor pads the higher order bits with zeroes if necessary, to fill out nbits bits. The common usage of this constructor would be to initialize a bitstring of a certain length with zeroes, e.g.,

bitstring x(0,16);
Be aware when using a non-zero value initializer that the bits will be "reversed." For example, the bit pattern for the decimal number 21025 is 0101001000100001, but the output from the following code

bitstring x(21025,16);
cout << x << endl;
is 1000010001001010.

1.3 Constructor bitstring(const string& s)

Description

Reads bit characters ('1's and '0's) from s and sets the corresponding bits in the new bitstring. The first bit read corresponds to bit 0.

Exceptions

Throws invalid_argument if any character in the string is not a '1' or '0'.

1.4 Constructor bitstring(const bitstring& b)

Description

The usual copy constructor.

2.0 Destructor bitstring()

Description

Frees resources and destroys the object.

3.0 Other Member Functions

3.1 Function string to_string()

Description

Creates a string of '1's and '0's representing the contents of *this. The first (left-most) character is bit 0.

Returns

The temporary string of '1's and '0's.

3.2 Function bitstring& operator=(const bitstring&)

Description

Standard assignment operator.

Returns

A reference to *this.

3.3 Function int operator==(const bitstring&b) const

Description

Compares *this to b for equality. Two bitstrings are equal if and only if they have the same number of bits and their bit patterns are identical.

Returns

Non-zero if the bitsets are equal, zero otherwise.

3.4 Function int operator!=(const bitstring&b) const

Description

Compares *this to b for inequality. Equivalent to !operator==(b);

Returns

Zero if the bitsets are equal, non-zero otherwise.

3.5 Functions set

Synopsis

bitstring& set(size_t n, int val = 1)
bitstring& set()

Description

These functions set one or more bits. The function set(size_t, int can also reset a bit, depending on the contents of val.

3.5.1 Function bitstring& set (size_t n, int val = 1)

Description

Sets the nth bit if val is non-zero, otherwise resets the bit. Appends a bit if n == length.

Returns

A reference to *this.

Exceptions

Throws out_of_range if n > length.

3.5.2 Function bitstring& set()

Description

Sets all bits,

Returns

A reference to *this.

3.6 Functions reset

Synopsis

bitstring& reset(size_t n)
bitstring& reset()

Description

These functions reset one or more bits.

3.6.1 Function bitstring& reset(size_t n)

Description

Resets the nth bit. Appends the bit if n == length.

Returns

A reference to *this.

Exceptions

Throws out_of_range if n > length.

3.6.2 Function bitstring& reset()

Description

Resets all bits.

Returns

A reference to *this.

3.7 Functions toggle

Synopsis

bitstring& toggle(size_t n)
bitstring& toggle()

Description

These functions toggle one or more bits.

3.7.1 Function bitstring& toggle(size_t n)

Description

Toggles the nth bit.

Returns

A reference to *this.

Exceptions

Throws out_of_range if n >= length.

3.7.2 Function bitstring& toggle()

Description

Toggles all bits.

Returns

A reference to *this.

3.8 Function int test(size_t n) const

Description

Tests if bit n is set.

Returns

Non-zero if the bit is set, zero otherwise.

Exceptions

Throws out_of_range if n >= length.

3.9 Function int any() const

Description

Tests if any bits at all are set.

Returns

Zero if all bits are 0, non-zero otherwise.

3.10 Function int none() const

Description

Tests if no bits at all are set. Equivalent to !any.

Returns

One if all bits are 0, non-zero otherwise.

3.11 Function bits operator~() const

Description

Toggles all the bits of a copy of *this.

Returns

A toggled copy of *this.

3.12 Function size_t count() const

Description

Counts the number of bits that are set.

Returns

The number of 1 bits in the string.

3.13 Function bitstring& operator&=(const bitstring&b)

Description

Replaces the current object with the bitwise-AND of b and *this. This function behaves as if the shorter bitstring is filled with zeroes on the right, so the operands of the AND operation will be of equal length. The new length of the object is the maximum of the lengths of the old object and b.

Returns

A reference to *this.

3.14 Function bitstring& operator|=(const bitstring& b)

Description

Replaces the current object with the bitwise-OR of b and *this. This function behaves as if the shorter bitstring is filled with zeroes on the right, so the operands of the XOR operation will be of equal length. The new length of the object is the maximum of the lengths of the old object and b.

Returns

A reference to *this.

3.15 Function bitstring& operator^=(const bitstring& b)

Description

Replaces the current object with the bitwise-XOR of b and *this. This function behaves as if the shorter bitstring is filled with zeroes on the right, so the operands of the XOR operation will be of equal length. The new length of the object is the maximum of the lengths of the old object and b.

Returns

A reference to *this.

3.16 Function bitstring& operator>>=(size_t n)

Description

Modifies the current object by shifting *this right, by n bit positions. If n > length, then this function resets all bits. Shifting "right" by n means that bit n gets the value of bit 0, bit (n+1) gets bit 1, etc. This function resets any vacated bits.

Returns

A reference to *this.

3.17 Function bitstring& operator<<=(size_t n)

Description

Modifies the current object by shifting *this left by n bit positions. if n > length, then this function resets all bits. Shifting "left" by n means that bit 0 receives the value of bit n, bit 1 receives bit (n+1), etc.

Returns

A reference to *this.

3.18 Function bitstring operator>>(const bitstring& b, size_t n) const

Description

Shifts b right by n bit positions. Does not modify b. (See 3.16).

Returns

The shifted result in a temporary bitstring.

3.19 Function bitstring operator<<(const bitstring& b, size_t n) const

Description

Shifts b left by n bit positions. Does not modify b, (See 3.17).

Returns

The shifted result in a temporary bitstring.

3.20 Function bitstring& operator+= (constbitstring& b)

Description

Appends b destructively to *this.

Returns

A reference to *this.

Exceptions

Throws length_error if length >= NPOS - b. length.

3.21 Function bitstring& insert (size_t pos, const bitstring& b)

Description

Inserts b into *this starting at bit position pos. The bits of *this starting at pos follow the inserted copy of b. Appends b if pos == length.

Returns

A reference to *this.

Exceptions

Throws out_of_range if pos > length. Throws length_error if length >= NPOS - b.length.

3.22 Function bitstring& remove(size_t pos, size_t n = NPOS)

Description

Removes up to n bits from *this starting at bit position pos. Truncates, starting at position pos, if length - pos < n.

Returns

A reference to *this.

Exceptions

Throws out_of_range if pos > length.

3.24 Function bitstring&replace(size_t pos, size_t n, const bitstring& b)

Description

Equivalent to a remove followed by a replace in the obvious way. Included for convenience and efficiency.

Returns

A reference to *this.

Exceptions

Throws out_of_range if pos >= length. Throws length_error if the new length would be >= NPOS.

3.25 Function size_t find(int val, size_t pos = 0) const

Description

Searches forward, starting from position pos, for the first occurrence of a 0 bit (if val is zero) or a 1 bit (if val is non-zero).

Returns

The found bit position, if the search was successful, NPOS otherwise.

Exceptions

Throws out_of_range if pos > length.

3.26 Function size_t rfind(int val, size_t pos = NPOS) const

Description

Searches backwards from position pos (or from position length-1 if pos == NPOS) for the first occurrence of a 0 bit (if val is zero) or a 1 bit (if val is nonzero).

Returns

The found bit position, if the search was successful, NPOS otherwise.

Exceptions

Throws out_of_range if pos > length.

3.27 Function bitstring substr(size_t pos, size_t n = NPOS) const

Description

Creates a new bitstring consisting of the bit pattern starting at position pos. This function copies up to n bits or all bits to the end of the string, whichever number is smaller. If pos == length, this function creates an empty bitstring.

Returns

The new bitstring.

Exceptions

Throws out_of_range if pos > length.

3.28 Functions length()

Synopsis

size_t length()
size_t length(size_t n, int val = 0)

Description

These functions determine and/or modify the length of a bit-string.

3.28.1 Function size_t length() const

Description

Determines the number of bits in the string.

Returns

The number of bits.

3.28.2 Function size_t length(size_t n, int val = 0)

Description

Resizes (i.e., truncates or expands) the object so that this->length == n. If the length increases, this function extends the object on the right with 0s if val == 0, or with 1s otherwise.

Returns

The old length.

3.29 Function bitstring& trim()

Description

Truncates high-order 0 bits. If N is the highest-numbered 1 bit, then this->length becomes N+1.

Returns

A reference to *this.

4.0 Global Functions

4.1 Function ostream& operator<<(ostream& os, const bitstring& b)

Description

Sends the sequence of '1's and '0's corresponding to b to the output stream os.

Returns

A reference to the stream os.

4.2 Function istream& operator>>(istream& is, bitstring& b)

Description

Reads a contiguous sequence of '1's and '0's from a stream into b. Skips whitespace according to is.skipws. The first bit read is bit 0. The first non-whitespace, nonbit character terminates the read and remains in the stream. This function sets is.failbit if the first non-whitespace character is not a '1' or '0'.

Returns

A reference to the stream is.

4.3 Function bitstring operator&(const bitstring& b1, const bitstring& b2)

Description

Computes the bitwise-AND of operands b1 and b2. The length of the result equals the length of the longer operand. This function behaves as if, prior to the operation, the shorter operand was extended with high-order 0 bits until its length equaled that of the longer operand.

Returns

The result of the bitwise-AND in a temporary bitstring.

4.4 Function bitstring operator|(const bitstring& b1, const bitstring& b2)

Description

Computes the bitwise-OR of operands b1 and b2. The length of the result equals the length of the longer operand. This function behaves as if, prior to the operation, the shorter operand was extended with high-order 0 bits until its length equaled that of the longer operand.

Returns

The result of the bitwise-OR in a temporary bitstring.

4.5 Function bitstring operator^(const bitstring& b1, const bitstring& b2)

Des cription

Computes the exclusive-OR of operands b1 and b2. The length of the result equals length of the longer operand. This function behaves as if, prior to the operation, the shorter operand was extended with high-order zero bits until its length equaled that of the longer operand.

Returns

The result of the exclusive-OR in a temporary bitstring.

4.6 Function bitstring operator+(const bitstring& b1, const bitstring& b2)

Description

Appends b non-destructively to *this.

Returns

The result of the concatenation in a temporary bitstring.

Implementation Notes

Following what seems to be the most popular coding style, the class definition in Listing 4 specifies the public interface first, followed by the private artifacts of the implementation. Whenever possible, inline functions appear outside the class definition itself. This placement is not possible in the case of the functions word, offset, mask1, and mask0, because they return objects of the private nested type Block. These functions appeared as macros in last month's C version. As private, static member functions, they are available only to other bitstring member functions.

I pack the bits of a bitstring left-to-right into an array of integral blocks. You can substitute any unsigned type for unsigned int in the line

typedef unsigned int Block;
See last month's capsule for a detailed explanation of the packing and unpacking algorithms.

Last month I developed bit handling functions in C; this month I have restated them in C++. In that regard, this column presents nothing new. What is new this month is memory management and operator overloading. A bitstring expands or shrinks automatically as needed, and always uses the minimum number of blocks required to hold its bits. For example, the statements

bitstring b1, b2;
// . . .
b1 += b2;
append b2 to b1. Consider the implementation of operator+= from Listing 5:

bitstring& bitstring::operator+=(const bitstring& b)
{
     assert(nbits_ + b.nbits_ < NPOS);
     int oldlen = nbits_;

     length(nbits_ + b.nbits_);
     for (int i = 0; i < b.nbits_; ++i)
        (void) set_(oldlen+i,b.test_(i));

     return *this;
}
All this function has to do is resize the object and copy b's bits to the end of the object. If the number of bits in b2 is less than or equal to the number of unused bits in b1's allocated memory, then no memory reallocation is necessary. Whether the array needs to be resized or not, the member function length(size_t) resizes the object; if the array must be resized, it handles that too.

The implementation of the global operator+ is now trivial using operator+=:

bitstring operator+(const bitstring& b1, const bitstring&
b2)
{
     bitstring b(b1);
     return b.operator+=(b2);
}
We can do the same sort of thing for all the binary operators. Being small, the binary operators are good candidates for inlining, and therefore appear in bitstring.h.

The program in Listing 6 illustrates many of the member functions of the bitstring class. Next month I'll cover class bits, a template class for fixed-length bitsets.