<br><br><div class="gmail_quote">On Tue, Sep 4, 2012 at 2:27 PM, Jakob Stoklund Olesen <span dir="ltr">&lt;<a href="mailto:stoklund@2pi.dk" target="_blank">stoklund@2pi.dk</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div class="im"><br>

On Sep 4, 2012, at 12:20 PM, Craig Topper &lt;<a href="mailto:craig.topper@gmail.com">craig.topper@gmail.com</a>&gt; wrote:<br>

<br>

> The default implementation of findCommutedOpIndices returns the first two input operands, which works for these instructions. We also have precedent in the commuting of the SHRD and SHLD instructions, which have a third immediate argument. They have some custom commuting code, but they don't override the default implementation of findCommutedOpIndices.<br>


><br>

> I made them commutable so that TwoAddressInstructionPass and optimizeLoadInstr can better optimize them.<br>

<br>

</div>I see. Thanks.<br>

<div class="im"><br>

> We can still do better here because there are 3 different FMA3 opcodes that differ in which operand is the destructive dest and where the load can be folded. I'm not sure of the best way to work some of that into the infrastructure without creating a new pass. I'd appreciate any input you have on that.<br>


<br>

</div>I am not sure how that would work. Preferably, the destructive def should be tied to a use that is also a kill.<br></blockquote><div><br>Right. Effectively the 3 different opcodes provide all possible flavors of commuting of 3 operands. So somewhere we need to decide which of the 6 possible commute choices for 3 operands is appropriate. The existing infrastructure is obviously only designed for 2 possibilities.<br>
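<br>To make the six cases concrete, here is a rough sketch of the choice. It is not LLVM code: the names Role, Form, Layout and pickLayout are made up for illustration. It assumes the usual FMA3 operand numbering, where operand 1 is the tied (destructive) dest and operand 3 is the one a load can be folded into, and where 132 computes op1 * op3 + op2, 213 computes op2 * op1 + op3, and 231 computes op2 * op3 + op1. Given which value is killed and which one comes from a load, it picks the opcode form that still computes a * b + c:<br>

```cpp
#include <cassert>

// Role each value plays in the expression a * b + c.
enum class Role { MulA, MulB, Addend };

// FMA3 opcode forms. The digits are operand numbers: the first two are
// multiplied and the third is added, e.g. F231 is op2 * op3 + op1.
enum class Form { F132, F213, F231 };

// Which role ends up in each operand slot, plus the form that makes the
// instruction compute MulA * MulB + Addend for that placement.
struct Layout {
  Role op1;  // tied destructive dest
  Role op2;  // plain register source
  Role op3;  // register or folded load
  Form form;
};

// Hypothetical helper: put the killed value in operand 1 and the loaded
// value in operand 3; the remaining value goes in operand 2.
inline Layout pickLayout(Role killed, Role loaded) {
  assert(killed != loaded && "one value cannot fill two slots");
  // Roles are numbered 0, 1, 2, so the missing one is 3 minus the other two.
  Role other = static_cast<Role>(3 - static_cast<int>(killed) -
                                 static_cast<int>(loaded));
  // The position of the addend alone determines the form. Swapping the two
  // multiplicands never changes it, which is why 3 opcodes cover 6 orderings.
  Form form = killed == Role::Addend ? Form::F231
              : other == Role::Addend ? Form::F132
                                      : Form::F213;
  return Layout{killed, other, loaded, form};
}
```

For example, pickLayout(Role::Addend, Role::MulA) selects the 231 form with the addend tied to the dest, while pickLayout(Role::MulA, Role::Addend) selects the 213 form with the addend folded from memory.<br>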

 </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<span class="HOEnZb"><font color="#888888"><br>

/jakob<br>

<br>

</font></span></blockquote></div><br><br clear="all"><br>-- <br>~Craig<br>