[llvm-commits] [PATCH][FastMath, InstCombine] Fadd/Fsub optimizations

Eli Friedman eli.friedman at gmail.com
Mon Dec 17 15:49:51 PST 2012


On Fri, Dec 14, 2012 at 10:45 AM, Shuxin Yang <shuxin.llvm at gmail.com> wrote:
>   I'd like to resurrect my previous change. See the attachment. It was
> revised per
> your requests. In addition to that, I change the Coeffficient class a bit
> such
> that the cost of default  constructor is merely a 4-byte-store-zero. This
> cost
> renders it possible to "abuse" this data structure.
>
>   As to the determinstic sorting, I have not yet figure out a cost effective
> way
> for that matter. Note that stable_sort() won't help as all Addends sharing
> the same symbolic value will be folded, we don't care their relative
> posittion
> before folding.

We discussed this in person; it sounded like you have a viable solution.

>  Tons of thanks in advance for the code review. (testing-case is not
> included in this
> patch for clarity purpose)

General comment: please try and write as many tests as possible.
Fast-math is generally a new area, so there's very little existing
coverage to depend on.  In particular. please make sure all the
variants of "(x+y) + (y-x)" etc. we discussed are there.

+  I.setHasUnsafeAlgebra(true);
+  if (I.hasUnsafeAlgebra()) {
+    if (Value *V = FAddCombine(Builder).simplify(&I))
+      return ReplaceInstUsesWith(I, V);
+  }

Please remember to remove the I.setHasUnsafeAlgebra for the final patch.

+    bool IsFp;
+
+    // True iff FpValBuf contains a instance of APFloat.
+    bool BufHasFpVal;

How are these different?

+    union {
+      char FpValBuf[sizeof(APFloat)];
+      int dummy; // So this structure has at least 4-byte alignment.
+    };

llvm::AlignedCharArrayUnion.  And I'm assuming you benchmarked and
found this was worth the complexity.

+  // Currently flush-to-0 is ignored. Following stmtement is to suppress
+  // compile-warning.
+  FlushToZero = !FlushToZero;

We generally prefer the pattern "(void)FlushToZero;".

It doesn't look like you addressed my earlier comment about
FAddend::drillDownOneStep.

+    Value *V0 = I->getOperand(0);
+    Value *V1 = I->getOperand(1);
+    InstQuota = ((!isa<Constant>(V0) && V0->hasOneUse()) &&
+                 (!isa<Constant>(V1) && V1->hasOneUse())) ? 2 : 1;

I'm not sure what your logic is here; can you add a comment to explain?

+unsigned FAddend::drillDownOneStep
+  (Value *Val, FAddend &Addend0, FAddend &Addend1) {
+  Instruction *I = 0;
+  if (Val == 0 || !(I = dyn_cast<Instruction>(Val)))
+    return 0;

I'm not sure what you're trying to do with this check.

+  // step 4: Try to optimize Opnd0 + Opnd1_0 [+ Opnd1_1]
+  if (Opnd1_ExpNum) {
+    AddendVect AllOpnds;
+    AllOpnds.push_back(&Opnd0);
+    AllOpnds.push_back(&Opnd1_0);
+    if (Opnd1_ExpNum == 2)
+      AllOpnds.push_back(&Opnd1_1);
+
+    if (Value *R = simplifyFAdd(AllOpnds, 1))
+      return R;
+  }

This seems redundant: why wouldn't it be handled in "step 3"?  Maybe
the way drillDownOneStep works should be changed a little?

-Eli



More information about the llvm-commits mailing list