[clang] [clang] Better bitfield access units (PR #65742)

Fri Sep 8 09:53:46 PDT 2023

efriedma-quic wrote:

The primary issue with over-wide bitfield accesses is ensuring we don't cause data races.  C++ [intro.memory]p3: "A memory location is either an object of scalar type that is not a bit-field or a maximal sequence of adjacent bit-fields all having nonzero width.  Two or more threads of execution can access separate memory locations without interfering with each other."  So that imposes an upper limit: we can't store to any byte that might overlap another memory location, such as a neighboring non-bitfield member.
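
To make the upper limit concrete, here is a minimal sketch.  The struct `S` and the helper names are hypothetical, not taken from the PR, and the offset of `c` depends on the target ABI (on Itanium-style ABIs it typically lands inside the first four bytes, but that is an assumption, not something the code relies on).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical struct for illustration; not taken from the PR.
struct S {
    unsigned a : 3;  // a and b are adjacent nonzero-width bit-fields,
    unsigned b : 5;  // so together they form a single memory location
    char c;          // a separate memory location
};

// True if a store as wide as the declared type (unsigned), starting at the
// beginning of S, would also cover c.  The answer is ABI-dependent.
constexpr bool unsigned_wide_store_would_cover_c() {
    return offsetof(S, c) < sizeof(unsigned);
}

// Whatever width the compiler picks for this store, c must not be written.
void set_a(S &s, unsigned v) { s.a = v; }

// Single-threaded sanity check: updating a leaves b and c intact.
void check_neighbor_survives() {
    S s{1u, 2u, 'x'};
    set_a(s, 5u);
    assert(s.a == 5 && s.b == 2 && s.c == 'x');
}
```

Where `unsigned_wide_store_would_cover_c()` is true, an access unit as wide as the declared type would write `c` as well, which could race with another thread that stores to `c`.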

There isn't really a lower limit: the ABI rule is that we access as many bytes as possible, but we aren't obligated to access all of those bytes; the parts of an access that don't read or write the field itself aren't observable.
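
As a rough illustration of why a narrower access is indistinguishable, the sketch below models one 4-byte access unit as a byte array, with a hypothetical 4-bit field in the low bits of byte 1.  The names and layout are assumptions made for the example; this is not how clang represents anything.

```cpp
#include <array>
#include <cassert>

using Unit = std::array<unsigned char, 4>;  // hypothetical 4-byte access unit

// Wide access: load all four bytes, update the field, store all four back.
void set_field_wide(Unit &mem, unsigned char v) {
    Unit tmp = mem;  // models the 4-byte load
    tmp[1] = static_cast<unsigned char>((tmp[1] & 0xF0u) | (v & 0x0Fu));
    mem = tmp;       // models the 4-byte store
}

// Narrow access: touch only the byte that actually holds the field.
void set_field_narrow(Unit &mem, unsigned char v) {
    mem[1] = static_cast<unsigned char>((mem[1] & 0xF0u) | (v & 0x0Fu));
}

// Both strategies leave memory in the same state.
void check_same_result() {
    Unit a{{0xAA, 0xBC, 0xCC, 0xDD}};
    Unit b = a;
    set_field_wide(a, 0x7);
    set_field_narrow(b, 0x7);
    assert(a == b);
}
```

The wide version rewrites bytes 0, 2 and 3 with the values they already held, so a single-threaded observer cannot tell the two apart.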

The advantage of exposing the wide accesses to the optimizer is that it allows memory optimizations, like CSE or DSE, to reason about the entire bitfield as a single unit, and easily eliminate accesses.  Reducing the size of bitfield accesses in the frontend is basically throwing away information.
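
A hand-written model of the effect, assuming a 32-bit access unit with two hypothetical 8-bit fields (`lo` in bits 0-7, `hi` in bits 8-15); this mimics the load/mask/or/store shape of a bitfield update rather than reproducing clang's actual IR:

```cpp
#include <cassert>
#include <cstdint>

// Each setter is a read-modify-write of the whole 32-bit unit.
inline void set_lo(std::uint32_t &unit, std::uint32_t v) {
    unit = (unit & ~0xFFu) | (v & 0xFFu);
}
inline void set_hi(std::uint32_t &unit, std::uint32_t v) {
    unit = (unit & ~0xFF00u) | ((v & 0xFFu) << 8);
}

// Two back-to-back updates of the same unit.
void set_both(std::uint32_t &unit, std::uint32_t lo, std::uint32_t hi) {
    set_lo(unit, lo);  // this store is overwritten by the next one (DSE candidate)
    set_hi(unit, hi);  // this load can reuse the value just stored
}

// The single load and single store that the pair above is equivalent to.
void set_both_merged(std::uint32_t &unit, std::uint32_t lo, std::uint32_t hi) {
    unit = (unit & ~0xFFFFu) | (lo & 0xFFu) | ((hi & 0xFFu) << 8);
}

// Both forms compute the same final unit value.
void check_equivalent() {
    std::uint32_t a = 0xDEADBEEFu, b = 0xDEADBEEFu;
    set_both(a, 0x12u, 0x34u);
    set_both_merged(b, 0x12u, 0x34u);
    assert(a == b);
}
```

Because both setters address the same unit at the same width, the redundancy is easy to see.  If the two fields were instead accessed through different narrow units, the optimizer would have less to work with when trying to combine them.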

The disadvantage is, as you note here, that sometimes codegen doesn't manage to clean up the resulting code well.

I guess my primary question here is, instead of making clang try to guess which accesses will be optimal, can we improve the way the LLVM code generator handles the patterns currently generated by clang?  I'm not exactly against changing the IR generated by clang, but changing the IR like this seems likely to improve some cases at the expense of others.

https://github.com/llvm/llvm-project/pull/65742