[PATCH] D158079: [InstCombine] Contracting x^2 + 2*x*y + y^2 to (x + y)^2 (float)
Christoph Stiller via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Aug 16 14:25:46 PDT 2023
rainerzufalldererste added inline comments.
================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp:1059
+ m_FMul(m_Deferred(B), m_Deferred(B))))));
+ }
+
----------------
rainerzufalldererste wrote:
> goldstein.w.n wrote:
> > This match code is basically identical to `foldSquareSumInts`. The only difference other than `FMul` vs `Mul` is you do match `FMul(A, 2)` for floats and `m_Shl(A, 1)` for ints.
> > Can you make the match code a helper that takes either fmul/2x matcher (or just lambda wrapping) so it can be used for SumFloat / SumInt?
> Does that imply that `m_c_FAdd` can simply be replaced with `m_c_Add` and will continue to match properly for floating point values as well?
> I presume that would entail partially matching another pattern and then deferring the actual check for the mul2 match, as `BinaryOp_match<RHS, LHS, OpCode>` would have different `OpCode`s for `FMul` and `Shl`, which sounds like a huge mess to me; or is there a cleaner way to do that?
>
> Something like this sadly doesn't compile (as the lambda return type is ambiguous):
> ```
> const auto FpMul2Matcher = [](auto &value) {
> return m_FMul(value, m_SpecificFP(2.0));
> };
> const auto IntMul2Matcher = [](auto &value) {
> return m_Shl(value, m_SpecificInt(1));
> };
> const auto Mul2Matcher = FP ? FpMul2Matcher : IntMul2Matcher;
> ```
Even something like this shouldn't work.
```
template <typename TMul2, typename TCAdd, typename TMul>
static bool MatchesSquareSum(BinaryOperator &I, Value *&A, Value *&B,
const TMul2 &Mul2, const TCAdd &CAdd,
const TMul &Mul) {
// (a * a) + (((a * 2) + b) * b)
bool Matches =
match(&I, CAdd(m_OneUse(Mul(m_Value(A), m_Deferred(A))),
m_OneUse(Mul(CAdd(Mul2(m_Deferred(A)), m_Value(B)),
m_Deferred(B)))));
// ((a * b) * 2) or ((a * 2) * b)
// +
// (a * a + b * b) or (b * b + a * a)
if (!Matches) {
Matches =
match(&I, CAdd(m_CombineOr(m_OneUse(Mul2(Mul(m_Value(A), m_Value(B)))),
m_OneUse(Mul(Mul2(m_Value(A)), m_Value(B)))),
m_OneUse(CAdd(Mul(m_Deferred(A), m_Deferred(A)),
Mul(m_Deferred(B), m_Deferred(B))))));
}
return Matches;
}
```
I agree that it's messy to have duplicate code, but with the way op-codes are used as template parameters I don't see a way without template specialization to do this nicely; and with template specialization it's even more of a beast.
Am I missing some obvious way built into llvm/InstCombine to do this nicely?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D158079/new/
https://reviews.llvm.org/D158079
More information about the llvm-commits
mailing list