[llvm-commits] [llvm] r170226 - in /llvm/trunk: lib/Transforms/InstCombine/InstCombineMulDivRem.cpp test/Transforms/InstCombine/fast-math.ll

Fri Dec 21 04:32:40 PST 2012

Hi Shuxin,

On 17/12/12 18:51, Shuxin Yang wrote:
> Your proposed transformation:
>       X op (select cond A, B) -> select cond (X op A), (X op B)
> could improve performance iff  *BOTH* "X op A" and "X op B" don't need extra
> instructions to evaluate their value.
>
> Other than the "cond ? 1 : 0" expression,  It is hard to imagine real-world
> applications have such opportunities.

Sorry, I should have been more explicit.  I was trying to make the following
points:

(1) There is an analogous optimization for almost every operation.  Examples:

   integer multiplication: x * (select cond 1, 0) -> select cond x, 0
   addition: x + (select cond 0, -x) -> select cond x, 0
   boolean operations: x xor (select cond 0, x) -> select cond x, 0
   division: (select cond x, 0) / x -> select cond 1, 0
   comparisons: x < (select cond x, 0) -> select cond false, true
     (signed compare, when the sign bit of x is known to be set)

and so on.  They can all be done using basically the same code by leveraging
"instruction simplify" (a possible implementation is given after point (2)).
It would be great if you could implement them all.  Sadly it is probably going
to mean copying and pasting essentially the same code all over the place: in
the instcombine mul logic, add logic, xor logic etc.
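
As a quick sanity check of the rewrites listed in (1), here is a small sketch
that evaluates both sides on plain integers.  The names sel, rewritesAgree and
checkRewrites are made up for this illustration and have nothing to do with
LLVM; the division case assumes x is nonzero and the addition case assumes -x
does not overflow.

```cpp
#include <cassert>
#include <initializer_list>

// Stand-in for the IR select instruction on ordinary C++ values.
template <typename T> T sel(bool cond, T a, T b) { return cond ? a : b; }

// Returns true if, for this x and cond, each rewrite from the list computes
// the same value as the expression it replaces.
bool rewritesAgree(int x, bool cond) {
  bool ok = true;
  // integer multiplication: x * (select cond 1, 0) -> select cond x, 0
  ok = ok && (x * sel(cond, 1, 0) == sel(cond, x, 0));
  // addition: x + (select cond 0, -x) -> select cond x, 0
  ok = ok && (x + sel(cond, 0, -x) == sel(cond, x, 0));
  // boolean operations: x xor (select cond 0, x) -> select cond x, 0
  ok = ok && ((x ^ sel(cond, 0, x)) == sel(cond, x, 0));
  // division: (select cond x, 0) / x -> select cond 1, 0 (x must be nonzero)
  if (x != 0)
    ok = ok && (sel(cond, x, 0) / x == sel(cond, 1, 0));
  // comparisons: the true arm is always false; the false arm is x < 0, which
  // is the constant true exactly when the sign bit of x is set.
  ok = ok && ((x < sel(cond, x, 0)) == sel(cond, false, x < 0));
  return ok;
}

// Spot-check a few values, including negative ones for the comparison case.
void checkRewrites() {
  for (int x : {-7, -1, 0, 1, 42})
    for (bool cond : {false, true})
      assert(rewritesAgree(x, cond));
}
```

This only checks that the two sides agree on values; whether a rewrite is
profitable is a separate question, which is what point (2) is about.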

(2) "instruction simplify" is the canonical module for seeing if an operation
simplifies to a constant or existing instruction.  Using it costs nothing, so
why not use it?  There are two big advantages (besides the fact that it is "the
right thing to do"):
  (a) if someone improves the instsimplify fmul logic then this transform
benefits automatically.  For example, if someone implements
x * (1.0 / x) -> 1.0 when the right fast math flags are set then you
immediately get
   x * (select cond (1.0 / x), 1.0) -> select cond 1.0, x
and so on.

  (b) it makes for a uniform implementation that can be reused in all of the
cases mentioned in (1).  When looking at, say, the "add" logic, if you see
essentially the same code as you already saw when reading the "mul" logic then
you immediately understand what is going on, i.e. using instsimplify (rather
than custom logic) makes instcombine easier to understand, thus easier to
maintain.
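
To make point (2) concrete without pulling in LLVM, here is a minimal sketch on
a toy model.  Everything in it (Val, Sel, simplifyMul, foldMulOfSelect) is a
hypothetical stand-in invented for this illustration, not an LLVM API:
simplifyMul plays the role of the simplifier, and the fold itself knows nothing
about multiplication.

```cpp
#include <cassert>

// Toy stand-ins, not LLVM types: a value is either the opaque variable x
// or an integer constant.
struct Val {
  bool isVar; // true: the opaque variable x
  long c;     // the constant, meaningful only when !isVar
};
Val var() { return Val{true, 0}; }
Val cst(long c) { return Val{false, c}; }
bool same(Val a, Val b) {
  return a.isVar == b.isVar && (a.isVar || a.c == b.c);
}

// Plays the role of the simplifier: succeeds only if a * b is a constant or
// an already existing value, i.e. needs no new instruction.
bool simplifyMul(Val a, Val b, Val &out) {
  if (!a.isVar && !b.isVar) { out = cst(a.c * b.c); return true; }
  if ((!a.isVar && a.c == 0) || (!b.isVar && b.c == 0)) {
    out = cst(0);
    return true;
  }
  if (!a.isVar && a.c == 1) { out = b; return true; }
  if (!b.isVar && b.c == 1) { out = a; return true; }
  return false; // e.g. x * x or 2 * x: a real multiply is still needed
}

// A select with the condition left out, since the fold never changes it.
struct Sel {
  Val tv, fv;
};

// (select cond tv, fv) * op -> select cond (tv * op), (fv * op), but only if
// both arms simplify.  All knowledge about multiplication is in simplifyMul.
bool foldMulOfSelect(Sel s, Val op, Sel &out) {
  Val ntv, nfv;
  if (!simplifyMul(s.tv, op, ntv) || !simplifyMul(s.fv, op, nfv))
    return false;
  out = Sel{ntv, nfv};
  return true;
}

// Self-check: (select cond 1, 0) * x -> select cond x, 0.
void checkFold() {
  Sel out;
  assert(foldMulOfSelect(Sel{cst(1), cst(0)}, var(), out));
  assert(same(out.tv, var()) && same(out.fv, cst(0)));
}
```

In this toy, teaching simplifyMul a new rule makes foldMulOfSelect fire in more
cases without touching it, which is the benefit described in (a); swapping
simplifyMul for a simplifier of another operation gives the uniformity
described in (b).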

Anyway, here is a possible implementation:

+  Value *Cond, *TV, *FV;
+  // Select on the left: fold only if *both* arms simplify, so that no new
+  // instructions are needed.
+  if (Op0->hasOneUse())
+    if (match(Op0, m_Select(m_Value(Cond), m_Value(TV), m_Value(FV))))
+      if (Value *NTV = SimplifyFMulInst(TV, Op1, I.getFastMathFlags(), TD))
+        if (Value *NFV = SimplifyFMulInst(FV, Op1, I.getFastMathFlags(), TD)) {
+          SelectInst *SI = cast<SelectInst>(Op0);
+          SI->setTrueValue(NTV);
+          SI->setFalseValue(NFV);
+          return ReplaceInstUsesWith(I, SI);
+        }
+  // Select on the right: same logic with the operands the other way round.
+  if (Op1->hasOneUse())
+    if (match(Op1, m_Select(m_Value(Cond), m_Value(TV), m_Value(FV))))
+      if (Value *NTV = SimplifyFMulInst(Op0, TV, I.getFastMathFlags(), TD))
+        if (Value *NFV = SimplifyFMulInst(Op0, FV, I.getFastMathFlags(), TD)) {
+          SelectInst *SI = cast<SelectInst>(Op1);
+          SI->setTrueValue(NTV);
+          SI->setFalseValue(NFV);
+          return ReplaceInstUsesWith(I, SI);
+        }

Changing SimplifyFMulInst to SimplifyMulInst gives you the integer mul
version; changing it to SimplifyFAddInst gives you the fp add version and so on
(for integer operations you should pass nsw/nuw flags rather than fast math
flags of course).  For commutative operations like FMul the second (Op1)
version can easily be removed by first canonicalizing any select instruction to
the left-hand side before doing the Op0 logic.  For non-commutative operations
there might be a clever way to compress the code but I'm not sure it is worth
it.
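
The canonicalization step for commutative operations could look roughly like
the sketch below.  Operand and canonicalizeSelectToLHS are hypothetical names
for this illustration only; in instcombine the same idea would be a swap of Op0
and Op1 when Op1 is the select.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for an IR operand; only the "is it a select?" bit
// and an identifier matter here.
struct Operand {
  bool isSelect;
  int id;
};

// For a commutative operation, move a select on the right-hand side to the
// left so that only the "select is operand 0" case needs handling afterwards.
// Returns true if the operands were swapped.
bool canonicalizeSelectToLHS(Operand &lhs, Operand &rhs) {
  if (rhs.isSelect && !lhs.isSelect) {
    std::swap(lhs, rhs);
    return true;
  }
  return false;
}

// Self-check: a select on the right ends up on the left.
void checkCanonicalize() {
  Operand a = {false, 1}, b = {true, 2};
  assert(canonicalizeSelectToLHS(a, b));
  assert(a.isSelect && a.id == 2 && b.id == 1);
}
```

If both operands are selects nothing is swapped, and the Op0 logic simply sees
the left one.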

Note that I added a hasOneUse check and reused the existing SelectInst, rather
than skipping the use-count check and building a new SelectInst.  My thinking
was that otherwise this transform could multiply the number of SelectInsts a
lot, though maybe the cost of that isn't high.  What do you think?

Ciao, Duncan.

PS: The above code assumes that this patch has been applied:

--- include/llvm/Instructions.h	(revision 170668)
+++ include/llvm/Instructions.h	(working copy)
@@ -1467,6 +1467,10 @@
    Value *getTrueValue() { return Op<1>(); }
    Value *getFalseValue() { return Op<2>(); }

+  void setCondition(Value *Cond) { setOperand(0, Cond); }
+  void setTrueValue(Value *TVal) { setOperand(1, TVal); }
+  void setFalseValue(Value *FVal) { setOperand(2, FVal); }
+
    /// areInvalidOperands - Return a string if the specified operands are invalid
    /// for a select operation, otherwise return null.
    static const char *areInvalidOperands(Value *Cond, Value *True, Value *False);


>
>
> On 12/16/12 8:49 AM, Duncan Sands wrote:
>> Hi Shuxin,
>>
>> On 14/12/12 19:46, Shuxin Yang wrote:
>>> Author: shuxin_yang
>>> Date: Fri Dec 14 12:46:06 2012
>>> New Revision: 170226
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=170226&view=rev
>>> Log:
>>> rdar://12753946
>>>
>>> Implement rule : "x * (select cond 1.0, 0.0) -> select cond x, 0.0"
>>
>> this is a special case of:
>>
>>   X op (select cond A, B) -> select cond (X op A), (X op B)
>>
>> when X op A and X op B simplify (for example because passing X op A and
>> X op B to InstructionSimplify say that they simplify).  Any chance of
>> implementing this more general transform instead?
>>
>> Thanks, Duncan.
>