[PATCH] D80175: [PowerPC][MachineCombiner] reassociate fma to expose more ILP

ChenZheng via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed May 27 18:02:24 PDT 2020


shchenz marked an inline comment as done.
shchenz added a comment.

In D80175#2058335 <https://reviews.llvm.org/D80175#2058335>, @spatel wrote:

> I did not look at the patch itself except to notice that it is a lot of code...so I have to ask - did you look at implementing at least the 1st pattern in DAGCombiner? That seems like a general improvement for any superscalar micro-arch with no register pressure disadvantage.


Thanks for looking into this @spatel

Yes, I tried to implement pattern 1 in DAGCombiner, but I got some LIT failures related to register allocation on AArch64 and Thumb2. This kind of optimization also increases register pressure, so I think it is better not to add it to DAGCombiner.
The reasons I added these two patterns to MachineCombiner are:
1: This pass is intended for ILP-related optimizations.
2: Adding a register pressure estimation model here should be easier than in DAGCombiner. We could do an estimation similar to the one in MachineLICM if we want to model it in the future.
3: These two patterns have to be put together. After breaking pattern 2 (fma+fma+fma) into (fmul+fma+fma+fadd), the last `fadd` can be combined with the following two fmas as pattern 1, which gives us more parallel fmas.
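
To make the ILP argument for pattern 1 concrete, here is a minimal sketch of the critical-path length before and after the reassociation. It is not code from the patch: the `Latency` struct, the helper names, and the latency values are hypothetical, and it assumes every input (X, Y, M21, ...) is ready at cycle 0 and that there are enough execution units to issue the two independent fmas together.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical latencies in cycles. Real values would come from the
// target's scheduling model, not from this sketch.
struct Latency {
  int FAdd;
  int FMA;
};

// An instruction starts once both register operands are ready and
// finishes Lat cycles later.
int finish(int ReadyA, int ReadyB, int Lat) {
  assert(Lat >= 0 && "latency must be non-negative");
  return std::max(ReadyA, ReadyB) + Lat;
}

// Before: A = FADD X, Y;  B = FMA A, M21, M22;  C = FMA B, M31, M32
// Each instruction depends on the previous one, so latencies add up.
int criticalPathBefore(Latency L) {
  int A = finish(0, 0, L.FAdd);
  int B = finish(A, 0, L.FMA);
  int C = finish(B, 0, L.FMA);
  return C;
}

// After: A = FMA X, M21, M22;  B = FMA Y, M31, M32;  C = FADD A, B
// The two fmas are independent and can overlap.
int criticalPathAfter(Latency L) {
  int A = finish(0, 0, L.FMA);
  int B = finish(0, 0, L.FMA);
  int C = finish(A, B, L.FAdd);
  return C;
}
```

Under these assumptions the chain shrinks from FAdd + 2*FMA to FMA + FAdd cycles, i.e. one fma latency is taken off the critical path. The sketch says nothing about register pressure, which is the trade-off discussed above.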

I agree that this can be extended to other platforms that support destructive hardware fma instructions. But I am not familiar with other platforms' instruction sets, so for now I have only implemented it on PowerPC.



================
Comment at: llvm/lib/Target/PowerPC/PPCInstrInfo.cpp:313-319
+//   A =  FADD X,  Y          (Leaf)
+//   B =  FMA  A,  M21,  M22  (Prev)
+//   C =  FMA  B,  M31,  M32  (Root)
+// -->
+//   A =  FMA  X,  M21,  M22
+//   B =  FMA  Y,  M31,  M32
+//   C =  FADD A,  B
----------------
spatel wrote:
> I was confused here because I was expecting the C++ style notation for FMA (X*Y+Z):
> https://en.cppreference.com/w/cpp/numeric/math/fma
This comment is target-specific. On PowerPC, most FMA-like instructions, such as xsmaddadp/xsmaddasp/xvmaddadp/xvmaddasp, are defined in the above form in the ISA.
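
For readers used to the C++ operand order, here is a small sketch of how the notation in the comment maps to `std::fma`. It assumes that `FMA A, M1, M2` denotes `A + M1 * M2` (addend first), which is what the transformation in the comment implies. The function names are hypothetical and are not part of the patch.

```cpp
#include <cmath>

// Assumed convention from the comment: FMA A, M1, M2 computes A + M1 * M2.
// std::fma(x, y, z) computes x * y + z, so the addend moves to the last slot.
double ppcStyleFMA(double A, double M1, double M2) {
  return std::fma(M1, M2, A);
}

// Before: A = FADD X, Y;  B = FMA A, M21, M22;  C = FMA B, M31, M32
double pattern1Before(double X, double Y, double M21, double M22,
                      double M31, double M32) {
  double A = X + Y;
  double B = ppcStyleFMA(A, M21, M22);
  return ppcStyleFMA(B, M31, M32);
}

// After: A = FMA X, M21, M22;  B = FMA Y, M31, M32;  C = FADD A, B
double pattern1After(double X, double Y, double M21, double M22,
                     double M31, double M32) {
  double A = ppcStyleFMA(X, M21, M22);
  double B = ppcStyleFMA(Y, M31, M32);
  return A + B;
}
```

Both forms compute X + Y + M21*M22 + M31*M32 in exact arithmetic. With floating-point rounding the two results can differ in general, so this kind of rewrite is only valid when reassociation is permitted.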



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80175/new/

https://reviews.llvm.org/D80175




