<div dir="ltr"><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Aug 13, 2014 at 9:31 PM, Justin Bogner <span dir="ltr"><<a href="mailto:mail@justinbogner.com" target="_blank">mail@justinbogner.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="">Chandler Carruth <<a href="mailto:chandlerc@google.com">chandlerc@google.com</a>> writes:<br>

>     CodeGen: When bitfields fall on natural boundaries, split them up<br>

><br>

</div><div class="">> Please revert. We used to do this, and I worked very hard to specifically<br>

> design the system to work the way it does now. There are copious comments,<br>

> email threads, and discussions where we debated how to handle this and decided<br>

> to do so in the exact way.<br>

><br>

> If you want to re-open the discussion, cool. But please do so as a discussion,<br>

> and cite specifically why the rationale for using fully-wide loads and stores<br>

> is inapplicable or causes insurmountable problems.<br>

<br>

</div>Hm, I hadn't really realized the history on this. I guess you're talking<br>

about r169489, "Rework the bitfield access IR generation". Looks to me<br>

like we really did a lot more than just break bitfields up on natural<br>

boundaries.<br></blockquote><div><br></div><div>The old code was written by ddunbar. I talked to him extensively before rewriting most of it. The attempt of the old code was to slice it up into sizeof(int) chunks. There was lots of complexity as well, but that was the primary motivation for the code.</div>

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

Note that my change here doesn't change the way we lay the bitfields out<br>

*at all*, and that all of the tests added in r169489 pass unchanged with<br>

it.</blockquote><div><br></div><div>Yes, I understand what the change is doing.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> I can understand the argument against extra complexity though: I'll<br>


get back to that in a minute.</blockquote></div><br>This isn't an argument against extra complexity, this is the *wrong model*.</div><div class="gmail_extra"><br></div><div class="gmail_extra">Very fundamentally, it is essential that the frontend emit the widest loads and stores permitted by the language spec. The memory model very narrowly constrains the backends ability to widen loads or stores or to merge adjacent loads and stores across control dependencies. By emitting the full load and store at each point, LLVM is able to combine *much* more aggressively around control dependencies without violating the memory model.</div>

<div class="gmail_extra"><br></div><div class="gmail_extra">In addition to breaking this theoretical power of the middle-end optimizers, it also effectively masks racing memory accesses to different bitfield slots from tools like ThreadSanitizer. By using the full width in the initial load/store emission, sanitizers are aware of the *potential* domain of any race regardless of what gets dropped during lowering.</div>

<div class="gmail_extra"><br></div><div class="gmail_extra">I'll reply separately to the specific performance problems, but please revert this until there is an actual discussion about changing this very fundamental design constraint. =/ This is not "obvious" or something that should go in without review and careful consideration.</div>

</div>