Updated patch.<div><br></div><div>Now passes all regression tests. Tests for the PR in question have been added.</div><div><br></div><div>I've implemented some really gross logic in the ObjCRuntime part of IR-Gen. Hope it's OK... =/</div>
<div><br></div><div>I'd love to get some feedback on this and commit it soon. I'll be bootstrapping and running it through the test suite when I can.</div><div class="gmail_extra"><br><br><div class="gmail_quote">
On Mon, Aug 27, 2012 at 2:17 AM, Chandler Carruth <span dir="ltr"><<a href="mailto:chandlerc@google.com" target="_blank">chandlerc@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hello,<div><br></div><div>I've attached a *draft* patch that is a re-working of our record layout and IR generation for bitfields. The previous strategy inherently introduced widened stores (and loads) that violate the C++11 memory model, causing PR13691. The old system was also rather elaborate without much justification. My patch shrinks the code significantly.</div>
<div><br></div><div>LLVM supports arbitrary sized integers, and bitfields *really are* the equivalent of a bag-of-bits or integer-vector that these integers support well. SRoA has been forming such integers in LLVM for some time. The various targets are in a good position to split and simplify these operations into the efficient representation... and when they fail, we should fix the target. So this patch essentially interpets a run of contiguous bitfields as a single large integer, and emits loads stores and bit operations appropriate for extracting bits from within it. This leaves any valid load widening to LLVM passes that are more target aware and also aware of instrumentation such as ASan or TSan that might be invalidated by load widening.</div>
<div><br></div><div>This has a few advantages:</div><div>- Very dense IR, we can now pre-compute the GEP when building the LValue for a bitfield, and that will be the only GEP needed. Also, the math to extract the value is simpler with this model.</div>
<div>- Maximal exposure to the optimizer of coalescing opportunities across bitfields. The SRoA of the IR produces will end up with a single SSA value for the collection of bits in the bitfield.</div><div><br></div><div>
It also has a two specific disadvantages in its current form, but I think I can address them eventually:</div>
<div>- It doesn't take advantage of unused bits, which can be assumed to be all-zero and save some masking.</div><div>- It also doesn't take advantage of padding following the bitfield to widen stores safely</div>
<div><br></div><div>The last one is very tricky to do correctly. We have to know that the object is POD-for-layout. I want to address this separately w/ some good test cases.</div><div><br></div><div><br></div><div>I hope to send up a cleaned up patch with test cases and results from a bootstrap and maybe test suite run tomorrow, but wanted to give folks an opportunity to comment on the basic strategy.</div>
<div><br></div><div>Somewhat terrifyingly, *all current regression tests pass*. =/</div>
</blockquote></div><br></div>