[PATCH] D82953: [ARM][MVE] Only tail-fold integer add reductions

Thu Jul 2 01:34:05 PDT 2020

samparker added inline comments.

================
Comment at: llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp:1337
+    }
+    if (I->getOpcode() != Instruction::Add) {
+      LLVM_DEBUG(dbgs() << "Only add reductions supported\n");
----------------
dmgreen wrote:
> I took a look at the reduction code in ARMLowOverheadLoops. The vectorizer can create add reductions in a number of places and in a number of ways (it can have multiple reductions, multiple reduction steps, look through phi's and selects and even change the type to a smaller bitwidth). I don't think the backend pass can handle all of them yet.
> 
> Considering how expensive reverting can be in comparison, what do you think of just disabling all reductions for the time being and re-enabling them once we know that things are working well enough? Maybe add a FIXME for it? I would expect D75069 to handle integer reduction in most cases eventually from the vectorizer directly and we can probably do something similar for other reduction types.
In my testing, I couldn't get the vectorizer to produce a tail-folded loop with multiple reductions, so I don't think we have to worry about that at the moment. My basic example also didn't even vectorize an FP reduction loop. I think disabling all reductions would be too conservative now. We should have done that from the beginning, but now that we can actually handle something, I think this cost function should align with the LowOverheadLoops implementation.
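
For illustration only, here is a minimal sketch of the two loop shapes discussed above: a single integer add reduction and its FP counterpart. The function names are hypothetical and these are not the test cases from this review; whether either loop is vectorized or tail-folded depends on the target, flags and compiler version.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical example: one integer add reduction, one accumulator,
// one reduction step per iteration. This is the simple shape the
// comment above refers to as something that can be handled.
int32_t sum_i32(const int32_t *a, size_t n) {
  assert(a != nullptr || n == 0);
  int32_t acc = 0;
  for (size_t i = 0; i < n; ++i)
    acc += a[i];
  return acc;
}

// Hypothetical example: the same loop with a floating-point
// accumulator, i.e. an FP add reduction.
float sum_f32(const float *a, size_t n) {
  assert(a != nullptr || n == 0);
  float acc = 0.0f;
  for (size_t i = 0; i < n; ++i)
    acc += a[i];
  return acc;
}
```

Both functions are plain scalar code, so they can be compiled for any target to compare what the vectorizer does with each.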

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82953/new/

https://reviews.llvm.org/D82953