[llvm] [AArch64] Lower partial add reduction to udot or svdot (PR #101010)
Paul Walker via llvm-commits
llvm-commits at lists.llvm.org
Tue Aug 27 10:41:28 PDT 2024
================
@@ -21229,6 +21249,101 @@ static SDValue tryCombineWhileLo(SDNode *N,
return SDValue(N, 0);
}
+SDValue tryLowerPartialReductionToDot(SDNode *N,
----------------
paulwalker-arm wrote:
I think you're trying to do too much in this one PR, which is making it hard to see if the complexity is required. Please can we start by just handling the cases that have a direct mapping to DOT instructions.
Specifically, please remove the `nxv4i64` result type handling because this seems to be the primary source of complexity. With that gone I think you'll be able to talk purely about EVTs and do less element count based maths. For example, I think you'll be able to implement the function more akin to:
```
validate input and get pre-extend operands
if (result_type == nxv4i32 && input_type == nxv16i8)
  return getNode(...);
if (result_type == nxv2i64 && input_type == nxv8i16)
  return getNode(...);
return SDValue();
```
This will give us a foundation for future PRs to build on (i.e. to support non-legal types, 2-way dot products, usdot and figure out what ISD node semantics we need to best handle the intrinsic).
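To make the early-return shape concrete, here is a minimal standalone sketch of the type dispatch only. It is not LLVM code: `VecTy` and `selectDot` are hypothetical names invented for illustration, the enum stands in for the EVTs, and a returned mnemonic stands in for the `getNode(...)` call, whose arguments are left elided above.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-ins for the scalable vector types named in the review.
// In the real combine these would be EVT values, not this enum.
enum class VecTy { nxv16i8, nxv8i16, nxv4i32, nxv2i64, nxv4i64 };

// Hypothetical helper: returns the DOT mnemonic for a (result, input) pair
// that maps directly onto a DOT instruction, or std::nullopt otherwise so
// the caller can leave the node untouched (the `return SDValue();` case).
std::optional<std::string> selectDot(VecTy Result, VecTy Input, bool IsSigned) {
  const std::string Name = IsSigned ? "sdot" : "udot";
  // 8-bit inputs accumulate into 32-bit lanes.
  if (Result == VecTy::nxv4i32 && Input == VecTy::nxv16i8)
    return Name;
  // 16-bit inputs accumulate into 64-bit lanes.
  if (Result == VecTy::nxv2i64 && Input == VecTy::nxv8i16)
    return Name;
  // Anything else, including an nxv4i64 result, is not handled here.
  return std::nullopt;
}
```

With this shape, `selectDot(VecTy::nxv4i64, VecTy::nxv16i8, false)` simply yields no match, which is the behaviour I'd expect from this PR once the `nxv4i64` handling is removed.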
https://github.com/llvm/llvm-project/pull/101010