[llvm] [IR][LangRef] Add partial reduction add intrinsic (PR #94499)

Wed Jun 12 08:11:33 PDT 2024

================
@@ -7914,6 +7914,28 @@ void SelectionDAGBuilder::visitIntrinsicCall(const CallInst &I,
     setValue(&I, Trunc);
     return;
   }
+  case Intrinsic::experimental_vector_partial_reduce_add: {
+    auto DL = getCurSDLoc();
+    auto ReducedTy = EVT::getEVT(I.getType());
+    auto OpNode = getValue(I.getOperand(1));
+    auto FullTy = OpNode.getValueType();
+
+    auto Accumulator = getValue(I.getOperand(0));
+    unsigned ScaleFactor = FullTy.getVectorMinNumElements() / ReducedTy.getVectorMinNumElements();
+
+    for(unsigned i = 0; i < ScaleFactor; i++) {
----------------
huntergr-arm wrote:

I'm now a bit concerned about the semantics of the intrinsic. In one of the test cases below (partial_reduce_add), you have the same size vector for both inputs. Applying this lowering results in the second vector being reduced and the result added to the first lane of the accumulator, with the other lanes being untouched.

I _think_ the idea was to reduce the second input vector until it matched the size of the first, then perform a vector add of the two. If both are the same size to begin with, you just need to perform a single vector add. @paulwalker-arm can you clarify?

The langref text will need to make the exact semantics clear.

https://github.com/llvm/llvm-project/pull/94499