[PATCH] D80175: [PowerPC][MachineCombiner] reassociate fma to expose more ILP
ChenZheng via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed May 27 18:02:24 PDT 2020
shchenz marked an inline comment as done.
shchenz added a comment.
In D80175#2058335 <https://reviews.llvm.org/D80175#2058335>, @spatel wrote:
> I did not look at the patch itself except to notice that it is a lot of code...so I have to ask - did you look at implementing at least the 1st pattern in DAGCombiner? That seems like a general improvement for any superscalar micro-arch with no register pressure disadvantage.
Thanks for looking into this @spatel
Yes, I tried to implement pattern 1 in DAGCombiner, but I got some LIT failures related to register allocation on AArch64 and Thumb2. This kind of optimization also increases register pressure, so I think it is better not to add it in DAGCombiner.
The reasons I added these two patterns in MachineCombiner are:
1: This pass is targeted at ILP-related optimizations.
2: Adding a register pressure estimation model here should be easier than in DAGCombiner. We could do an estimation similar to the one in MachineLICM if we want to model it in the future.
3: These two patterns have to be put together. After breaking pattern 2 (fma+fma+fma) into (fmul+fma+fma+fadd), the last `fadd` can be combined with the following two fmas as pattern 1, which gives us more parallel fmas.
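To illustrate point 3, here is a minimal C++ sketch of how pattern 2 might be split. The exact operand assignment is an assumption made for illustration, not taken from the patch: the only thing stated above is that (fma+fma+fma) becomes (fmul+fma+fma+fadd). The helper `ppcFma` and both function names are hypothetical, and the addend-first operand order follows the notation used in the inline comment.

```cpp
#include <cmath>

// Hypothetical helper: addend-first FMA, i.e. FMA A, M1, M2 = A + M1 * M2.
inline double ppcFma(double addend, double m1, double m2) {
  return std::fma(m1, m2, addend);
}

// Assumed shape of pattern 2 before the rewrite: three chained FMAs,
// each one waiting for the result of the previous one.
double pattern2Chained(double x, double m11, double m12, double m21,
                       double m22, double m31, double m32) {
  double a = ppcFma(x, m11, m12);
  double b = ppcFma(a, m21, m22);
  return ppcFma(b, m31, m32);
}

// Assumed shape after the rewrite: fmul + fma + fma + fadd.
double pattern2Split(double x, double m11, double m12, double m21,
                     double m22, double m31, double m32) {
  double t = m11 * m12;            // FMUL
  double b = ppcFma(x, m21, m22);  // FMA, does not depend on t
  double d = ppcFma(t, m31, m32);  // FMA, depends only on t
  return b + d;                    // FADD joins the two chains
}
```

In the split form, `t` and `b` have no dependency on each other, so they can issue in parallel. Both forms compute x + m11*m12 + m21*m22 + m31*m32, but because floating-point addition is not associative, they can round differently in general.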
I agree that this can be extended to other platforms that support destructive hardware fma instructions. But I am not familiar with other platforms' instruction sets, so currently I only implement it on PowerPC.
================
Comment at: llvm/lib/Target/PowerPC/PPCInstrInfo.cpp:313-319
+// A = FADD X, Y (Leaf)
+// B = FMA A, M21, M22 (Prev)
+// C = FMA B, M31, M32 (Root)
+// -->
+// A = FMA X, M21, M22
+// B = FMA Y, M31, M32
+// C = FADD A, B
----------------
spatel wrote:
> I was confused here because I was expecting the C++ style notation for FMA (X*Y+Z):
> https://en.cppreference.com/w/cpp/numeric/math/fma
This comment is target-specific. On PowerPC, most fma-like instructions, such as xsmaddadp/xsmaddasp/xvmaddadp/xvmaddasp, are defined in the above form in the ISA, i.e. with the addend first, followed by the two multiplicands.
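As a rough illustration of the notation, the sketch below writes pattern 1 in C++ with `FMA A, M1, M2` read as `A + M1 * M2`. The helper `fmaAddendFirst` and the two function names are hypothetical and only mirror the comment in the diff; they are not code from the patch.

```cpp
#include <cmath>

// Hypothetical helper mirroring the comment's notation:
// FMA A, M1, M2 = A + M1 * M2 (std::fma takes the addend last).
inline double fmaAddendFirst(double a, double m1, double m2) {
  return std::fma(m1, m2, a);
}

// Before: A = FADD X, Y; B = FMA A, M21, M22; C = FMA B, M31, M32.
// Every instruction depends on the one before it.
double pattern1Chained(double x, double y, double m21, double m22,
                       double m31, double m32) {
  double a = x + y;                         // Leaf
  double b = fmaAddendFirst(a, m21, m22);   // Prev
  return fmaAddendFirst(b, m31, m32);       // Root
}

// After: A = FMA X, M21, M22; B = FMA Y, M31, M32; C = FADD A, B.
// The two FMAs are independent of each other.
double pattern1Reassociated(double x, double y, double m21, double m22,
                            double m31, double m32) {
  double a = fmaAddendFirst(x, m21, m22);
  double b = fmaAddendFirst(y, m31, m32);
  return a + b;
}
```

Both functions compute x + y + m21*m22 + m31*m32. For small integer-valued inputs they agree exactly, but in general the reassociated form can round differently.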
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D80175/new/
https://reviews.llvm.org/D80175