[PATCH] D40696: Enable aggressive FMA on T99 and provide AArch64 option for other micro-arch's

Thu Dec 7 13:02:29 PST 2017

MatzeB added inline comments.

================
Comment at: test/CodeGen/AArch64/fma-aggressive.ll:323
+
+attributes #0 = { nounwind "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="true" "no-jump-tables"="false" "no-nans-fp-math"="true" "no-signed-zeros-fp-math"="true" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="thunderx2t99" "target-features"="+lse,+neon,+v8.1a" "unsafe-fp-math"="true" "use-soft-float"="false" }
+attributes #1 = { argmemonly nounwind }
----------------
fhahn wrote:
> steleman wrote:
> > fhahn wrote:
> > > steleman wrote:
> > > > fhahn wrote:
> > > > > steleman wrote:
> > > > > > fhahn wrote:
> > > > > > > Can we drop the meta data from here to the end or is it required for the test?
> > > > > > I'll check and if it works without the metadata, I'll remove it.
> > > > > Great thanks. On second look, I think there is no need for the test case to be that complicated, for example, do we need all those global variables, function calls, ifs, loops? AFAIK DagCombine only looks at nodes and uses and does not have a cost model tied to loop iterations. 
> > > > > 
> > > > > Unless I am missing something, it should be enough to have a couple of basic blocks with fadd/fmul instructions to test this, e.g. fadd/fmul with multiple users, or the more complex patterns guarded by `if (Aggressive)` in `visitFADDForFMACombine`.
> > > > > 
> > > > > A much simpler test case will make it easier to debug if it fails in the future and also easier to understand when reviewing if it does the right thing.
> > > > 
> > > > It's not as simple as you think.
> > > > 
> > > > DAGCombiner.cpp:9636
> > > > 
> > > > 
> > > > ```
> > > >   // fold (fmul (fadd x, +1.0), y) -> (fma x, y, y)
> > > >   // fold (fmul (fadd x, -1.0), y) -> (fma x, y, (fneg y))
> > > >   auto FuseFADD = [&](SDValue X, SDValue Y) {
> > > >     if (X.getOpcode() == ISD::FADD && (Aggressive || X->hasOneUse())) {
> > > >       auto XC1 = isConstOrConstSplatFP(X.getOperand(1));
> > > >       if (XC1 && XC1->isExactlyValue(+1.0))
> > > >         return DAG.getNode(PreferredFusedOpcode, SL, VT, X.getOperand(0), Y, Y);
> > > >       if (XC1 && XC1->isExactlyValue(-1.0))
> > > >         return DAG.getNode(PreferredFusedOpcode, SL, VT, X.getOperand(0), Y,
> > > >                            DAG.getNode(ISD::FNEG, SL, VT, Y));
> > > >     }
> > > >     return SDValue();
> > > >   };
> > > > 
> > > > ```
> > > > 
> > > > Note:
> > > > 
> > > > 
> > > > ```
> > > > if (X.getOpcode() == ISD::FADD && (Aggressive || X->hasOneUse()))
> > > > ```
> > > > 
> > > > What we want to do here is make sure that X has more than one use. I.e. X->hasOneUse() evaluates to false. That's the only way of testing that FMA happens when Aggressive is true.
> > > > 
> > > > In order to make sure that X ends up having more than one use, the code needs to be complicated.
> > > > 
> > > > This code isn't lifted from some existing program. I wrote it specifically for this purpose. If there was no need for these loops, ifs, global variables, etc, I wouldn't have written them.
> > > > 
> > > > 
> > > You do not need complicated code to have get multiple uses. In the function below, %mul has 2 uses and will only be combined by DAGCombine, if aggressive FMA fusion is enabled (that's the pattern defined in DAGCombiner.cpp:9099)
> > > 
> > > 
> > > ```
> > > define double @test(double %x, double %y, double %z) {
> > >   %mul = fmul fast double %x, %y
> > >   %add = fadd fast double %mul, %z
> > >   %use2 = fdiv fast double %mul, %add
> > >   ret double %use2
> > > }
> > > ```
> > 
> > I will be happy to include your test case in a separate, and different test for Aggressive FMA, but I do not see how your test demonstrates Aggressive FMA as triggered by the compilation of a real program, written in a language that software developers realistically write software in.
> > 
> > Also: your test does not provide any method of validating the results produced by enabling Aggressive FMA. Nor does it provide an easy method of detecting that some potential future changes have introduced a regression.
> > 
> > For Aggressive FMA, I believe that it is important to have a real-life program that can be compiled and run, and whose results can be compared against the results produced by a different compiler, preferably on a different architecture.
> > 
> > 
> > Also: your test does not provide any method of validating the results produced by enabling Aggressive FMA. Nor does it provide an easy method of detecting that some potential future changes have introduced a regression.
> 
> Sorry, I just intended to highlight how to create a function that tests an isolated pattern that is triggered by aggressive FMA. Ideally we would have similar functions for all important patterns. Then the test should catch all relevant regressions in DAGCombine with aggressive FMA.
> 
> > For Aggressive FMA, I believe that it is important to have a real-life program that can be compiled and run, and whose results can be compared against the results produced by a different compiler, preferably on a different architecture.
> 
> Agreed! But for that, the LLVM test-suite is probably a better place to make sure there are no performance regressions as it uses Clang directly, like users would, see http://llvm.org/docs/TestingGuide.html .
> 
> `llvm/test` should contain small, isolated regression tests.  I see why it is tempting to add a big test case like that to llvm/test, but 1) it still only guards against changes in Codegen, but opt or clang could mess up the IR to begin with and 2) it makes the regression test suite more fragile than it needs to be.
> 
> If others are happy with the big test, I won't object though
Yes bigger tests are better of in the test-suite; you summed up llvms testing strategy nicely!

Repository:
  rL LLVM

https://reviews.llvm.org/D40696