[cfe-commits] PATCH: Large re-working of bitfield IR-gen, and a fix for PR13691

John McCall rjmccall at apple.com
Tue Sep 11 11:33:55 PDT 2012


On Sep 11, 2012, at 11:07 AM, Chandler Carruth wrote:
> On Fri, Aug 31, 2012 at 1:01 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
> On Mon, Aug 27, 2012 at 10:33 AM, John McCall <rjmccall at apple.com> wrote:
> > On Aug 27, 2012, at 9:55 AM, Matthieu Monrocq wrote:
> >
> > On Mon, Aug 27, 2012 at 11:17 AM, Chandler Carruth <chandlerc at google.com>
> > wrote:
> >>
> >> Hello,
> >>
> >> I've attached a *draft* patch that is a re-working of our record layout
> >> and IR generation for bitfields. The previous strategy inherently introduced
> >> widened stores (and loads) that violate the C++11 memory model, causing
> >> PR13691. The old system was also rather elaborate without much
> >> justification. My patch shrinks the code significantly.
> >>
> >> LLVM supports arbitrary sized integers, and bitfields *really are* the
> >> equivalent of a bag-of-bits or integer-vector that these integers support
> >> well. SRoA has been forming such integers in LLVM for some time. The various
> >> targets are in a good position to split and simplify these operations into
> >> the efficient representation... and when they fail, we should fix the
> >> target. So this patch essentially interpets a run of contiguous bitfields as
> >> a single large integer, and emits loads stores and bit operations
> >> appropriate for extracting bits from within it. This leaves any valid load
> >> widening to LLVM passes that are more target aware and also aware of
> >> instrumentation such as ASan or TSan that might be invalidated by load
> >> widening.
> >>
> >> This has a few advantages:
> >> - Very dense IR, we can now pre-compute the GEP when building the LValue
> >> for a bitfield, and that will be the only GEP needed. Also, the math to
> >> extract the value is simpler with this model.
> >> - Maximal exposure to the optimizer of coalescing opportunities across
> >> bitfields. The SRoA of the IR produces will end up with a single SSA value
> >> for the collection of bits in the bitfield.
> >>
> >> It also has a two specific disadvantages in its current form, but I think
> >> I can address them eventually:
> >> - It doesn't take advantage of unused bits, which can be assumed to be
> >> all-zero and save some masking.
> >> - It also doesn't take advantage of padding following the bitfield to
> >> widen stores safely
> >>
> >> The last one is very tricky to do correctly. We have to know that the
> >> object is POD-for-layout. I want to address this separately w/ some good
> >> test cases.
> >>
> >
> > Just a reminder here, a derived class may use the padding of its base and
> > stuff things there.
> >
> >
> > Only if it's not POD, which I assume is why Chandler brought that up.
> 
> Another nasty case I just thought of:
> 
> struct x { int i : 24; };
> struct y { int i : 24; char j; };
> union z {
>   struct x x;
>   struct y y;
> };
> union z a;
> void f(int i) {
>   a.x.i = i;
> }
> void g(char j) {
>   a.y.j = j
> }
> 
> The two writes are to separate memory locations. :)
> 
> Wait, hold on... I'm deeply confused. Maybe because I don't know how C11 unions work?
> 
> With C++11, only one side of the union is allowed to be active, and so I don't think they are separate memory locations?

I agree that this isn't a problem, but the analysis is a bit more complicated;
it hinges on the fact that it's okay to *read* from an inactive union member
under the common-prefix exception, but it's not okay to *write* to it.  The
same analysis applies in both standards:

C++11 [class.mem]p18:
  If a standard-layout union contains two or more standard-layout
  structs that share a common initial sequence, and if the standard-layout
  union object currently contains one of these standard-layout structs, it
  is permitted to inspect the common initial part of any of them. Two
  standard-layout structs share a common initial sequence if corresponding
  members have layout-compatible types and either neither member is a
  bit-field or both are bit-fields with the same width for a sequence of one
  or more initial members.

C11 6.5.2.3p6:
  One special guarantee is made in order to simplify the use of unions: if
  a union contains several structures that share a common initial sequence
  (see below), and if the union object currently contains one of these structures,
  it is permitted to inspect the common initial part of any of them anywhere
  that a declaration of the completed type of the union is visible. Two
  structures share a common initial sequence if corresponding members
  have compatible types (and, for bit-fields, the same widths) for a sequence
  of one or more initial members.

John.



More information about the cfe-commits mailing list