<div dir="ltr"><div dir="ltr"></div><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, 29 May 2020 at 11:06, John McCall via llvm-dev &lt;<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 28 May 2020, at 18:42, Bill Wendling wrote:<br>

<br>

> On Tue, May 26, 2020 at 7:49 PM James Y Knight via llvm-dev<br>

> &lt;<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>&gt; wrote:<br>

>><br>

>> At least in this test-case, the "bitfield" part of this seems to be a <br>

>> distraction. As Eli notes, Clang has lowered the function to LLVM IR <br>

>> containing consistent i16 operations. Despite that being a different <br>

>> choice from GCC, it should still be correct and consistent.<br>

>><br>

> I suspect that this is more prevalent with bitfields as they're more<br>

> likely to have the load / bitwise op / store operations done on them,<br>

> resulting in an access type that can be shortened. But yes, it's not<br>

> specific to just bitfields.<br>

><br>

> I'm more interested in consistency, to be honest. If the loads and<br>

> stores for the bitfields (or other such shorten-able objects) were the<br>

> same, then we wouldn't run into the store-to-load forwarding issue on<br>

> x86 (I don't know about other platforms, but suspect that consistency<br>

> wouldn't hurt). I liked Arthur's idea of accessing the object using<br>

> the type size the bitfield was defined with (i8, i16, i256). It would<br>

> help with improving the heuristic. The downside is that it could lead<br>

> to un-optimal code, but that's the situation we have now, so...<br>

<br>

Okay, but what concretely are you suggesting here?  Clang IRGen is<br>

emitting accesses with consistent sizes, and LLVM is making them<br>

inconsistent.  Are you just asking Clang to emit smaller accesses<br>

in the hope that LLVM won’t mess them up?<br></blockquote><div><br></div><div>I don't think this has anything to do with bit-fields or Clang's lowering. This seems to "just" be an optimizer issue (one that happens to show up for bit-field access patterns, but also affects other cases). Much-reduced testcase:</div><div><br></div>unsigned short n;<br><div>void set() { n |= 1; }</div><div><br></div><div>For this testcase, -O2 generates a 1-byte 'or' instruction, <a href="http://quick-bench.com/e61y0Wn1qR-9K1YM6Bf9YoS6qfY">which will often be a pessimization</a> when there are also full-width accesses. I don't think the frontend can or should be working around this.</div></div></div>
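<div><br></div><div>To make the access pattern concrete, here is a minimal sketch in C that puts the reduced testcase next to a full-width read of the same variable. The names get and set_then_get are hypothetical additions for illustration and are not part of the testcase above; whether the 'or' is actually narrowed depends on the compiler version, flags, and target.</div>

```c
unsigned short n;

/* Reduced testcase from the message above: only bit 0 changes, so an
   optimizer may narrow the 16-bit read-modify-write to a 1-byte 'or'. */
void set(void) { n |= 1; }

/* Full-width access: a 2-byte load of the same object. */
unsigned short get(void) { return n; }

/* Hypothetical helper showing the mixed-width sequence. If set() is
   compiled to a 1-byte store, the 2-byte load in get() is wider than the
   store it follows, which on x86 typically cannot be satisfied by
   store-to-load forwarding. The result is the same either way; only the
   speed differs. */
unsigned short set_then_get(void) {
    set();
    return get();
}
```

<div>Comparing the -O2 assembly for set() and get() shows whether the two functions end up using the same access width.</div>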