r215614 - CodeGen: When bitfields fall on natural boundaries, split them up

Thu Aug 14 08:55:19 PDT 2014

Chandler Carruth <chandlerc at google.com> writes:
> This isn't an argument against extra complexity, this is the *wrong model*.
>
> Very fundamentally, it is essential that the frontend emit the widest loads
> and stores permitted by the language spec. The memory model very narrowly
> constrains the backend's ability to widen loads or stores or to merge adjacent
> loads and stores across control dependencies. By emitting the full load and
> store at each point, LLVM is able to combine *much* more aggressively around
> control dependencies without violating the memory model.
>
> In addition to breaking this theoretical power of the middle-end optimizers,
> it also effectively masks racing memory accesses to different bitfield slots
> from tools like ThreadSanitizer. By using the full width in the initial
> load/store emission, sanitizers are aware of the *potential* domain of any
> race regardless of what gets dropped during lowering.
>
> I'll reply separately to the specific performance problems, but please revert
> this until there is an actual discussion about changing this very fundamental
> design constraint. =/ This is not "obvious" or something that should go in
> without review and careful consideration.

Thanks for the detailed explanation; I appreciate the context on
this. Reverted in r215648. I'll take a look at teaching the ARM
backend to narrow this kind of load more effectively when I get a
chance.
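
For readers following along, here is a rough sketch of what "the widest
loads and stores" means for a bitfield access. It is written by hand in C
rather than taken from Clang's output, and the layout is an assumption made
only for illustration: one 32-bit storage unit holding an 8-bit field `a` in
bits 0..7 and a 24-bit field `b` in bits 8..31. The helper names are
hypothetical. Each access touches the whole unit and masks out the field,
which leaves any narrowing to a later stage.

```c
#include <stdint.h>

/* Assumed layout (illustrative only): field a = bits 0..7,
 * field b = bits 8..31 of a single 32-bit storage unit. */
#define A_MASK  0xFFu
#define B_SHIFT 8

/* Read field a: load the full 32-bit unit, then mask. */
static uint32_t load_a_wide(const uint32_t *unit) {
    return *unit & A_MASK;
}

/* Read field b: load the full 32-bit unit, then shift. */
static uint32_t load_b_wide(const uint32_t *unit) {
    return *unit >> B_SHIFT;
}

/* Write field a: full-width load, merge the new bits, full-width store.
 * The bits belonging to b are read and written back unchanged. */
static void store_a_wide(uint32_t *unit, uint32_t value) {
    uint32_t word = *unit;                        /* 32-bit load  */
    word = (word & ~A_MASK) | (value & A_MASK);   /* merge field  */
    *unit = word;                                 /* 32-bit store */
}
```

A narrowed form would instead touch only the byte holding `a`. That is the
kind of rewrite the quoted message argues should be left to the backend
rather than done at emission time.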