[PATCH] D24289: Add warning when assigning enums to bitfields without an explicit unsigned underlying type

Arthur O'Dwyer via cfe-commits cfe-commits at lists.llvm.org
Thu Nov 17 12:48:07 PST 2016


On Thu, Nov 17, 2016 at 9:52 AM, Richard Smith via cfe-commits
<cfe-commits at lists.llvm.org> wrote:
> On 17 Nov 2016 8:56 am, "Reid Kleckner" <rnk at google.com> wrote:
>> In https://reviews.llvm.org/D24289#598169, @rsmith wrote:
>>> This is causing warnings to fire for headers shared between C and C++,
>>> where the "give the enum an unsigned underlying type" advice doesn't
>>> work, and where the code in question will never be built for the MS ABI.
>>> It seems really hard to justify this being on by default.
>>>
>>> I'm going to turn it off by default for now, but we should probably
>>> consider turning it back on by default when targeting the MS ABI (as a
>>> "your code is wrong" warning rather than a "your code is not portable"
>>> warning).
[...]
>>> Yeah, suggesting adding an underlying type to the enum to solve this
>>> problem seems like a bad idea, since that fundamentally changes the
>>> nature of the enum -- typically allowing it to store a lot more values,
>>> and making putting it in a bitfield a bad idea.
>
>> Any time you use a bitfield it stores fewer values than the original
>> integer type. I don't see how enums are special here. [...]
>
> The range of representable values for a bitfield with no fixed underlying
> type is actually smaller than that of its underlying type. See
> http://eel.is/c++draft/dcl.enum#8
>
> So a bitfield of width equal to the number of bits needed to store any
> enumerator does not have fewer values than the original type.

My understanding (from osmosis and practice more than from reading the
standard) is that programmers are more likely to specify an "unnaturally
narrow" underlying type (e.g. "int8_t") than an "unnaturally wide" one
(e.g. "int32_t"). When I specify an underlying type, I'm saying "the
compiler is going to do the wrong thing with this type's *storage* by
default"; I'm not saying anything about the type's *value range*.
The same goes for bit-fields: I specify a number of bits after the colon
because the compiler would otherwise do the wrong thing with *storage*, not
because I'm trying to change the semantics of the *values* involved.


>> Do you have any better suggestions for people that want this code to do
>> the right thing when built with MSVC?
>>
>>   enum E /* : unsigned */ { E0, E1, E2, E3 };
>>   struct A { E b : 2; };
>>   int main() {
>>     A a;
>>     a.b = E3;
>>     return a.b; // comes out as -1 without the underlying type
>>   }
>>
>> Widening the bitfield wastes a bit. Making the bitfield a plain integer
>> and casting in and out of the enum type works, but that obscures the
>> true type of the bitfield. So, I still support this suggestion.

The safest fix is to just change ": 2" to ": 3", even though that "wastes a
bit" (really it wastes 0 bits in most cases and 32 to 64 bits in some
cases).

If I had my druthers, the compiler would be completely silent unless it
detected exactly the cases that would result in changes to the semantics of
enum *values*. That is,

    when declaring a bit-field of enum type E with width B bits:
      if E has an enumerator e whose value requires >B bits:
        warn that e cannot be stored in the bit-field
        if a fixit is really required, suggest increasing the bit-field's width
      if E has an enumerator e whose positive value requires exactly B bits,
      and E's underlying type is signed:
        warn that e cannot be stored in the bit-field when MSVC semantics
        are in use
        in C++, append the note that this happens because E's underlying
        type is signed
        if a fixit is really required, suggest increasing the bit-field's width

Changing the width of the bit-field can affect only the layout of the
struct in question; you could even static_assert that the size *doesn't*
change, if there are padding bits available.
Changing a whole type from signed to unsigned seems like it would have more
dangerous action-at-a-distance than just increasing the width of the
bit-field: that would *unavoidably* affect comparisons (and overload
resolution?) across the entire codebase using that type.
http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#172
Again I'm making a philosophical distinction between the details of storage
(e.g. struct layout) and the actual semantics of values and types (e.g.
signedness).

The problem is with the bit-field inside the struct, so IMHO the solution
should involve changing the bit-field and/or the struct. Not altering the
enum type... which by the way, the programmer might not even control,
right? What if the relevant enum type comes from a third-party library?

> How about we consider msvc's behaviour to just be a bug, and zero-extend
> loads from enum bitfields with no negative enumerators? Since loading
> such a negative value results in UB, this wouldn't change the meaning of
> any program with defined behaviour, and still respects the MS ABI.

SGTM, at least to put this behavior under a separately toggleable
command-line switch. (-fms-enum-bitfields?)

my $.02,
–Arthur