Code Capsules

Bit Handling in C++, Part 2

Chuck Allison


Chuck Allison is a regular columnist with CUJ and a software architect for the Family History Department of the Church of Jesus Christ of Latter Day Saints Church Headquarters in Salt Lake City. He has a B.S. and M.S. in mathematics, has been programming since 1975, and has been teaching and developing in C since 1984. His current interest is object-oriented technology and education. He is a member of X3J16, the ANSI C++ Standards Committee. Chuck can be reached on the Internet at allison@decus.org, or at (801)240-4510.

The bits Class Template

The standard C++ library has two classes for bit manipulation: bitstring and bits. Last month I discussed the bitstring class, which defines objects that behave like an array of bits that expands or contracts according to your needs. This month's installment explains the bits class, an abstraction which extends C's bitwise semantics by allowing easy access to bits, by allowing an arbitrary (but fixed) number of bits in a bitset, and by adding important new functionality. Following last month's format, I present here an excerpt from the official proposal accepted by the standards committee, a working implementation, and examples of how to use the class.

Class bits accommodates a fixed-length collection of bits. You can think of a bits object as an arbitrarily large unsigned integer. It is actually a class template, with the number of bits in the collection as the template parameter. (See the sidebar "Templates in a Nutshell.") It is highly suitable for interfacting with the host operating system, and is designed for efficiency. (It can be stack-based.) Here's a sample program run on a machine with 16-bit integers:

// tbits.cpp:
// Set some bits and display the result
#include <iostream.h>
#include <stddef. h>
#include <limits.h>
#include "bits.h"

main()
{

const size_t SIZE = CHAR_BIT * sizeof(int);
bits<SIZE> flags;
enum open_mode {in, out, ate, app, trunc, binary};
              
flags.set(in);
flags.set(binary);
cout << "flags:" << flags <<" (0x" << hex << flags.to_ushort() << ")\n";
cout << "binary?" << (flags.test(binary) ? "yes" : "no") << endl;
return 0;
}

Output
flags: 0000000000100001 (0x21)
binary? yes

Member Function Descriptions

This section is a modified excerpt from the official proposal which describes the semantics of each member function. For a quick look at the class interface, see Listing 6. Since the library group of the joint C++ standards committee is still deciding how to integrate exceptions into the standard library, I just mention them briefly here. The names and uses of exceptions are subject to change. I use asserts in place of exceptions in the implementation (see Listing 7) .

1.0 Constructors

Synopsis

bits()
bits(unsigned long n)
bits(const string& s)
bits(const bits<N>& b)

1.1 Constructor bits()

Description

Initializes all bits to zero.

1.2 Constructor bits(unsigned long n)

Description

Initializes the object with the bits of n. If N >sizeof(unsigned long) * CHAR_BIT, sets the extra bits to zero.

1.3 Constructor bits(const string& s)

Description

Each character of s is interpreted as a bit (a string of 1s and 0s is expected). In typical integral fashion, treats the last (right-most) character of s as bit 0.

Exceptions

Throws invalid_argument if a character other than 1 or 0 is encountered.

1.4 Constructor bits(const bits<N>& b)

Description

Standard copy constructor.

2.0 Destructor

No destructor required.

3.0 Other Member Functions

3.1 Function unsigned short to_ushort() const

Description

Converts the n least significant bits of *this (where n == sizeof(unsigned short) * CHAR_BIT) to an unsigned short. This is useful when the bits represent flags in a word passed to the operating system.

Exceptions

Throws overflow if N > n and any of the bits above position n-1 are set.

3.2 Function unsigned long to_ulong() const

Description

Converts the n least significant bits of *this (where n == sizeof(unsigned long) * CHAR_BIT) to an unsigned long.

Exceptions

Throws overflow if N > n and any of the bits above position n-1 are set.

3.3 Function string to_string() const

Description

Creates a string of 1s and 0s representing the contents of *this. As with unsigned integers, treats the last character as bit 0.

Returns

The temporary string of 1s and 0s.

3.4 Function bits<N>& operator:(const bits<N>& b)

Description

Standard assignment operator.

Returns

A reference to *this.

3.5 Function int operator==(const bits<N>& b) const

Description

Compares *this to b for equality. Two bitsets are equal if and only if their bit patterns are identical.

Returns

Non-zero if the bitsets are equal, zero otherwise.

3.6 Function int operator!=(const bits<N>& b) const

Description

Compares *this to b for inequality. Equivalent to !operator==().

Returns

Zero if the bitsets are equal, non-zero otherwise.

3.7 Functions set

Synopsis

bits<N>& set(size_t n, int val = 1)
bits<N>& set()

Description

These functions set one or more bits. The function set(size_t, int) can reset a bit, depending on val.

3.7.1 Function bits<N>& set[size_t n, int val)

Description

Sets the nth bit if val is non-zero, otherwise resets the bit.

Returns

A reference to *this.

Exceptions

Throws out_of_range if n is not in [0,N-1].

3.7.2 Function bits<N>& set()

Description

Sets all bits.

Returns

A reference to *this.

3.8 Functions reset

Synopsis

bits<N>& reset(size_t n)
bits<N>& reset()

Description

These functions reset one or more bits.

3.8.1 Function bits<N>& reset(size_t n)

Description

Resets the nth bit.

Returns

A reference to *this.

Exceptions

Throws out_of_range if n is not in [0, N-1].

3.8.2 Function bits<N>& reset()

Description

Resets all bits.

Returns

A reference to *this.

3.9 Functions toggle

Synopsis

bits<N>& toggle(size_t n)
bits<N>& toggle()

Description

These functions toggle one or more bits.

3.9.1 Function bits<N>& toggle(size_t n)

Description

Toggles the nth bit.

Returns

A reference to *this.

Exceptions

Throws out_of_range if n is not in [0,N-1].

3.9.2 Function bits<N>& toggle()

Description

Toggles all bits.

Returns

A reference to *this.

3.10 Function bits operator~() const

Description

Toggles all the bits of a copy of *this.

Returns

A toggled copy of *this.

3.11 Function int test(size_t n) const

Description

Tests if bit n is set.

Returns

Non-zero if the bit is set, zero otherwise.

Exceptions

Throws out_of_range if n is not in [0,N-1].

3.12 Function int any() const

Description

Tests if any bits at all are set.

Returns

0 if all bits are 0, non-zero otherwise.

3.13 Function int none() const

Description

Tests if no bits at all are set.

Returns

Non-zero if all bits are 0, 0 otherwise.

3.14 Function bits<N>& operator&=(const bits<N>& b)

Description

Performs a destructive bitwise-AND of b into *this.

Returns

A reference to *this.

3.15 Function bits<N>& operator/=(const bits<N>& b)

Description

Performs a destructive bitwise-OR of b into *this.

Returns

A reference to *this.

3.16 Function bits<N>& operator^=(const bits<N>& b)

Description

Performs a destructive bitwise exclusive-OR of b into *this.

Returns

A reference to *this.

3.17 Function bits<N>& operator>>=(size_t n)

Description

Shifts *this right destructively (i.e., in place) by n bit positions. Resets all bits if n > N. To shift "right" by n means that bit 0 receives the value of bit n, bit 1 receives bit (n+1), etc.

Returns

A reference to *this.

3.18 Function bits<N>& operator<<=(size_t n)

Description

Shifts *this left destructively by n bit positions. Resets all bits if n > N. To shift "left" by n means that bit n receives the value of bit 0, bit (n+1) receives bit 1, etc.

Returns

A reference to *this.

3.19 Function bits<N> operator>>(size_t n) const

Description

A non-destructive version of operator>>=().

Returns

The results of the shift in a temporary bits object.

3.20 Function bits<N> operator<<(size_t n) const

Description

A non-destructive version of operator<<=().

Returns

The results of the shift in a temporary bits object.

3.21 Function size_t count( ) const

Description

Counts the number of bits that are set.

Returns

The number of 1 bits in *this.

3.22 Function size_t length( ) const

Description

Returns the fixed-size length of the object.

Returns

   N.

4.0 Global Functions

4.1 Function ostream& operator<<(ostream& os, const bits<N>& b)

Description

Sends a sequence of N 1s and 0s corresponding to the bit pattern of *this to the stream os,

Returns

A reference to the stream os.

4.2 Function istream& operator>>(istream& is, bits<N>& b)

Description

Reads a sequence of up to N 1s and 0s from the stream is, after skipping whitespace. The first non-bit character thereafter terminates the read and remains in the stream. The corresponding bit pattern is reproduced in b. Treats the last 1 or 0 read from the stream as bit 0 of b.

Returns

A reference to the stream is.

4.3 Function bits<N>operator& (const bits<N>& b1, const bits<N>& b2)

Description

Performs a bitwise-AND of b1 and b.

Returns

The results of the bitwise-AND in a temporary bits object.

4.4 Function bits<N>operator / (const bits<N>& b1, const bits<N>& b2)

Description

Performs a bitwise-OR of b1 and b.

Returns

The results of the bitwise-OR in a temporary bits object.

4.5 Function bits <N> operator^ (const bits<N>& b1, const bits<N>& b2)

Description

Performs a bitwise exclusive-OR of b1 and b.

Returns

The results of the exclusive-OR in a temporary bits object.

Design Notes

Having an expression instead of a type as a template parameter has the following effects:

Since a bits object is an extension of unsigned integers as far as bitwise operations are concerned, a collection of bits behaves like a number, in that bit 0 is the right-most bit. To be consistent with C bitwise operations, the statements

bits<8> b;
b | = 5;
cout << b << endl;
give the result

00000101
that is, the bits of 5 are ORed with b (via the constructor bits(unsigned long)).

As you can see in Listing 7, I've taken some liberties in the implementation of this class by changing the type of the template parameter to a size_t. The reason for originally making the number of bits an unsigned long was to guarantee a minimum size across platforms (the ANSI C standard states that an unsigned long must be at least 32 bits wide). However, this would require that count and length return an unsigned long. As proof that this is unnatural, I offer the fact that I forgot to do so (the functions return a size_t because it "feels right") and no one on the committee noticed. (Actually, Bill Plauger finally noticed while he was editing the standard library documents, but that was four months after the bits class became official.) Furthermore, the corresponding functions in the string and bitstring classes also return a size_t. (They have no choice.) To be consistent with these classes, therefore, and because it just makes sense, I shall propose to the committee that we approve the obvious and make the template parameter a size_t.

I'm also thinking of changing the name from bits to bitset. This would allow you to refer to an object as a "bitset" instead of always having to say a "bits object." (Who would ever call it a "bits"?) And one could argue that the function to_ushort( ) is superfluous, since it is equivalent to (unsigned short) to_ulong( ).

Implementation Notes

The Code Capsule "Bit Handling in C" (CUJ, November, 1993) provides a thorough explanation of the internals of using an integral array to store and access individual bits, so I won't repeat it here. The techniques found therein serve both the bitstring and bits classes. The implementation of the string class is in last month's installment ("Bit Handling in C++, Part 1," CUJ, December 1993). Listing 8 has a test program that exercises most of the member functions of the bits class.

Sets of Integers

For those of you who miss some of the high-level features of Pascal and Modula-2, the bits class gives you sets of integers almost for free. Just define a bits object of a size appropriate for your application and do the following:

For the set operation: Do this:

insert x into s        s. set(x)
remove x from s        s. reset(x)
x member of s?         s. test(x)
complement of s        s. toggle() or ~s
s1 + s2 (union)        s1 / s2
s1 * s2 (intersection) s2 & s2
s1 - s2 (difference)   see below
s1 <= s2 (subset)      see below
s1 >= s2 (superset)    see below
If this still seems too "low-level," it is a trivial matter to define a set-like interface to the bits class.
Listing 9 defines a class template called Intset that has all the basic set operations along with an ostream inserter. The only operations that take any thought at all (but only very little) are set difference and subset. To remove from s1 the elements of s2, just reset in s1 the bits that are set in s2:

s1.bitset &= ~s2.bitset;
// see Intset<N>::operator-
If s1 is a subset of s2, then s1 is nothing more nor less than the intersection of the two sets:

s1 == s1 & s2
// true iff s1 <= s2
The test program in Listing 10 shows how to use the Intset class.

Conclusion

The acceptance of these two bit handling classes by the C++ standards committee shows that the needs of the system programmer have not been forgotten. Much of the hullabaloo over object-oriented technology emphasizes inheritance hierarchies of polymorphic, high-level abstract data types. These concepts are best left to specialized libraries and applications while the standard rightly focuses on commonly needed, low-level abstractions. If you have any comments on these classes, please contact me at the email address in the by-line at the bottom of the first page of this article.