<p dir="ltr">Thanks. </p>

<p dir="ltr">We probably shouldn't block improving clang on the potential for optimizing the existing code. </p>

<p dir="ltr">How about committing you patch and opening bugs on llvm and clang?</p>

<div class="gmail_quote">On Apr 20, 2015 8:36 PM, "Pete Cooper" <<a href="mailto:peter_cooper@apple.com">peter_cooper@apple.com</a>> wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>

> On Apr 20, 2015, at 5:25 PM, Matthias Braun <<a href="mailto:mbraun@apple.com">mbraun@apple.com</a>> wrote:<br>

><br>

><br>

>> On Apr 20, 2015, at 4:10 PM, Pete Cooper <<a href="mailto:peter_cooper@apple.com">peter_cooper@apple.com</a>> wrote:<br>

>><br>

>> Hi Rafael<br>

>><br>

>> MachineOperand::OpKind is an unsigned char, followed by 24-bits of bitfields.  clang is able to make all of that take 32-bits of space, but it generates pretty nasty code to do so.<br>

>><br>

>> OpKind itself is loaded and stored as an i8 and so gets very nice IR.  However, loading an entry from one of the bitfields results in a load of an i24, then masking operations.  Given that i24 isn’t legal, the legalizer goes to great lengths to make this work, typically splitting it in to i8 and i16 operations.<br>

> I don't know the actual clang code, but isn't the canonical codegen for something like the following "unsigned SubReg_TargetFlags : 12;" to load an "unsigned" value and shift/mask that? Why does it use i24?<br>

Attached code to demonstrate this.  Here’s a reduced version of MachineOperand compiled with and without the change.<br>

<br>

To answer your question, clang has decided that the 24 bits total of bitfields should be represented as a [3 x i8].  So to read one of the bitfields, it casts the [3 x i8]* to i24* then does a load.  I guess it just wants to always load and store the whole bitfield when operating on any one piece of it.<br>

<br>

I’m not sure if this is a requirement of C++ or not.  Personally, i’d like clang to use powers of 2 which are a multiple of 8 (i.e., 8, 16, 32, 64) for bitfield operations in the IR, but i don’t know if it can or not.<br>

<br>

Pete<br>

<br>

<br><br>

><br>

> - Matthias<br>

<br>

<br></blockquote></div>