[llvm-commits] [PATCH] SmallEnumSet

Mon Jan 12 22:32:29 PST 2009

On Jan 12, 2009, at 12:27 PM, Talin wrote:

> Before I begin, I should mention that one of my self-imposed design  
> constraints
> is that the class should be no less efficient than using the  
> traditional "flags"
> idiom. This means that for small sets of flags (less than 32), every  
> method in
> this class should theoretically be optimizable down to a single  
> machine
> instruction, and for larger sets the number of machine instructions  
> should be
> roughly on the order of the number of machine words in the bitset.
>
> That being said, I have not actually *verified* that this happens -  
> rather,
> based on my (admittedly somewhat incomplete) knowledge of what  
> optimizers can
> and cannot do, I have attempted to apply standard rules of thumb for  
> writing
> optimizable code. Some of those rules of thumb have evolved from  
> working with
> LLVM itself, and may represent an unwarranted level of confidence in  
> the
> compiler's ability to optimize, especially when the compiler isn't  
> LLVM-based.
>
> With that in mind, I believe some of my responses below will make  
> more sense.
>
That's fine, but once you create it, people will use/abuse it as much  
as they can. :-) So what was designed to fit into a register would be  
way too big.

>>> This patch contains a new container class, SmallEnumSet, and unit
>>> tests for it.
>>>
>>> SmallEnumSet is inspired by SmallVector and the other "small" sets
>>> in ADT.
>>>
>> From looking at the code, it seems like this is very similar to the
>> BitVector. The difference being that the size is known at compile
>> time. Is it possible either to use the BitVector or modify  
>> BitVector's
>> implementation so that you know the size at compile time (if that's
>> important enough)?
>
> You would need something like SmallBitVector in order for this to  
> work.
> BitVector *always* allocates memory, whereas SmallEnumSet *never*  
> does, so the
> two are completely different in this respect.
>
> Having the structure be a fixed size is important for small sets of
> flags,because it means that the structure will get passed in a  
> single machine
> register, just as a flags field would be.
>
I thought about this after I wrote the email. The architecture of what  
you are doing seems to be there in BitVector already (or could be  
added to it). It would be preferable if you could reuse the BitVector  
code in some way. Maybe you could factor out the ivar data part like  
this:

template <uint32_t N>
struct Vec {
   char BitVec[N];
   // ...
};

template <>
struct Vec<0> {
   char *BitVec;
   Vec() : BitVec(0) {}
   ~Vec() { delete [] BitVec; }
   // ...
};

template <uint32_t N>
class BitVectorImpl {
   Vec<N> Bits;
   // ...
};

typedef BitVectorImpl<0> BitVector;

You would have to modify the places in the current BitVector where it  
deallocates and/or grows the "Bits" ivar, but it would be a way of  
getting the size stuff set at compile time. Do you think that this  
would be sufficient for your needs? (I haven't done tests to see if  
this change would hurt code size or performance, though I find it  
unlikely.)

>>> The keys of the set are enumeration constants, which range from 0 to
>>> some maximum value which is known at compile time. Internally,
>>> SmallEnumSet maintains a fixed-length bit vector, so that membership
>>> in the set is represented by a single bit per possible key.
>>>
>> Enumerated values can go from negative to positive. They don't have  
>> to
>> start from 0.
>
> For enumerations that correspond to a set of flags, having values  
> less than zero
> makes no sense. So my response to that is "don't do that".
>
Okay. This is less like a "set" to me and more like your "flags"  
description. I think that a change in name would help things along.  
Either with using BitVector or something else.

>>> 	• It has a set-like interface, so you can use "add", "contains" and
>>> other set methods.
>>
>> The std::set doesn't use "add" or "contains". If you're going to call
>> this a set, you should use the same nomenclature: e.g. "insert",
>> "find". You should also have an "iterator" associated with this class
>> to keep things consistent.
>
> Sorry, for some reason I was following the Java set protocol rather  
> than the C++
> one. Habit I guess.
>
:-)

> You would probably want "test" (which returns a bool) rather than  
> "find" (which
> returns an iterator).
>
Sure. I think that BitVector uses "test".

>>> At the same time, SmallEnumSet retains many of the advantages of an
>>> integer flags field - it allocates no memory, and can efficiently be
>>> passed or returned by value between functions.
>>
>> This is only true if the size of the enum set is small enough that  
>> the
>> array holding the values can be passed back efficiently (without
>> calling memcpy or something similar).
>
> Agreed, however in practice I don't think this is a problem.
>
> Large collections of booleans are fairly rare; Of the ones that  
> exist, most are
> either sparse (in which case a bit vector is not the best  
> representation), or
> have keys which are not known at compile time or are otherwise  
> unbounded (in
> which case, BitVector is a better choice).
>
> For those rare cases where this class makes sense, I think it is  
> best to let the
> application programmer make the decision as to whether the set  
> should be passed
> by reference or by value - by choosing whether to pass the value or  
> the address
> between methods.
>
> Mainly, I expect this class to be used for fairly small sets, but I  
> don't want
> the programmer to have to switch to a different representation if  
> the number of
> flags bits grows to larger than 32.
>
Alright. Just as long as it's clear that passing by value may not be a  
win in all situations.

>> +  bool empty() const {
>> +    for (unsigned i = 0; i < NumWords; ++i) {
>> +      if (bits[i] != 0) {
>> +        return false;
>> +      }
>> +    }
>> +
>> +    return true;
>> +  }
>>
>> empty() should really execute in constant time. Maybe you could  
>> rename
>> it. BitVector uses "none()" and "any()".
>
> Well, it does execute in constant time(), because NumWords is a  
> constant :) More
> seriously, I'm assuming that the compiler will unroll this loop,  
> especially in
> the most common case where NumWords == 1.
>
You should check this out to make sure. I would still like to see it  
be O(1) for all values, though.

>> +  bool containsAll(const SmallEnumSet & in) const {
>> +    for (unsigned i = 0; i < NumWords; ++i) {
>> +      if ((~bits[i] & in.bits[i]) != 0) {
>> +        return false;
>> +      }
>> +    }
>> +
>> +    return true;
>> +  }
>>
>> How is this different from "operator==()"?
>
> It's not the same - this method is a subset test, it returns true if  
> 'in' is a
> subset of 'this'.
>
I see.

>> +  SmallEnumSet & addRange(EnumType first, EnumType last) {
>> +    unsigned firstWord = wordForIndex(first);
>> +    unsigned lastWord = wordForIndex(last);
>> +    if (firstWord == lastWord) {
>> +      bits[firstWord] |= maskForIndex(first) & ~maskForIndex(last);
>> +    } else {
>> +      bits[firstWord] |= maskForIndex(first);
>> +      for (unsigned i = firstWord + 1; i < lastWord; ++i) {
>> +        bits[i] = -1;
>> +      }
>> +      bits[lastWord] |= ~maskForIndex(last);
>> +    }
>> +
>> +    return * this;
>> +  }
>>
>> Same comment above. Call this "insert", etc. This should probably  
>> take
>> iterators.
>
> Unfortunately, there's no efficient way to implement iterators with  
> a bit vector
> - advancing an iterator requires examining every potential bit to  
> determine
> whether its on or off, so that you can skip over the ones that are  
> off (i.e. not
> in the set.)
>
> Also, I don't believe that there's a compelling practical use case  
> for iterating
> over a set like this.
>
That's kind of the downfall of calling this a "set". :-)

>> +  static SmallEnumSet noneOf() {
>> +    return SmallEnumSet();
>> +  }
>>
>> This name isn't very informative. What does this mean?
>
> It was supposed to return the empty set.
>
Why not just name it "emptySet" or something?

>> +  static SmallEnumSet of(EnumType v0) {
>> +    return SmallEnumSet().add(v0);
>> +  }
>> +
>> ...
>> +  static SmallEnumSet of(
>> +      EnumType v0, EnumType v1, EnumType v2, EnumType v3, EnumType  
>> v4,
>> +      EnumType v5, EnumType v6, EnumType v7, EnumType v8, EnumType
>> v9) {
>> +    return
>> SmallEnumSet().add(v0).add(v1).add(v2).add(v3).add(v4).add(v5)
>> +        .add(v6).add(v7).add(v8).add(v9);
>> +  }
>>
>> Yikes! Please don't do this. Use "inserts" instead that loop over a
>> vector of values or something similar.
>
> I want to avoid a loop because the most common use case for this  
> method is in
> constructing constants. For example, constructing a mask:
>
>   static Atttributes METHOD_ATTRS = Attributes.of(
>      InlineAttr, ExternAttr, IntrinsicAttr, PureAttr,  
> ConstructorAttr);
>
> The compiler should inline all of the calls to "add":
>
>   this = SmallEnumSet();
>   this.bits[0] |= k0;
>   this.bits[0] |= k1;
>   this.bits[0] |= k2;
>   this.bits[0] |= k3;
>   this.bits[0] |= k4;
>   this.bits[0] |= k5;
>   this.bits[0] |= k6;
>   this.bits[0] |= k7;
>   this.bits[0] |= k8;
>   this.bits[0] |= k9;
>
> And then should combine all of the constants:
>
>   this = SmallEnumSet();
>   this.bits[0] |= (k0 | k1 | k2 | k3 | k4 | k5 | k6 | k7 | k8 | k9);
>
> So in the end, all that happens (hopefully) is we are assigning a  
> constant
> integer to a symbol.
>
We have a precedent for this with the Machine Instruction builder. We  
do something like:

   BuildMI( ... ).add(...).add(...)

etc. It doesn't look glamorous, but it keeps us from having to create  
methods with arbitrarily long lists of parameters. And the compiler  
should be able to optimize the code in the same manner.

>> +  friend SmallEnumSet operator |(
>> +      const SmallEnumSet & a, const SmallEnumSet & b) {
>> +    return unionOf(a, b);
>> +  }
>> +
>> +  friend SmallEnumSet operator &(
>> +      const SmallEnumSet & a, const SmallEnumSet & b) {
>> +    return SmallEnumSet(a).intersectWith(b);
>> +  }
>>
>> Is it necessary to make these "friend" methods? They seem to use
>> publicly accessible methods. (I haven't tried to compile them without
>> the friend keyword, so I don't know.)
>
> They don't have to be - I'm just following what I thought was a  
> standard idiom
> for non-member operator definitions. ('friend' forces the function  
> to be a
> non-member.)

It only needs to be a friend if it has to access the internal data  
structures. Otherwise, it's best just to define these methods outside  
of the class. If they're in the header file, they need to be marked  
"inline".

-bw