[PATCH] D100745: [AArch64] Add AArch64TTIImpl::getMaskedMemoryOpCost function

David Sherwood via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Apr 20 01:58:31 PDT 2021


david-arm added inline comments.


================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:928
+    if (auto *VecTy = dyn_cast<FixedVectorType>(Src))
+      ScalarCost *= VecTy->getNumElements();
+    return ScalarCost;
----------------
sdesmalen wrote:
> Should this actually be something to add to BasicTTIImpl.h so that it can be reused for other targets? The cost of implementing a masked memory op would be: `NumElts * (cost(load element) + cost(insert element)) <=> NumElts * cost(load element) + ScalarizationOverhead`
> 
> Then we only have to implement this function for the scalable case, and all other cases can call the BasicTTIImpl.
I don't think BasicTTIImpl currently has getMaskedMemoryOpCost, but I can look into adding one if we think it's worthwhile? If so, we'd almost certainly want to update the ARM target too, because it does something very similar.

Does that cost you mention above take into account the compare and branch? I'd expect something like:

%1 = icmp
br i1 %1, ...
%2 = load
...
%3 = insertelement .... %2

I think it's important to reflect the cost of the branch here, since that's something the vector version wouldn't have.
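
To make the difference concrete, here is a rough sketch of the two cost formulas being discussed. It is not LLVM code: `ScalarCosts`, `costWithoutControlFlow` and `costWithControlFlow` are hypothetical names, and the unit costs are placeholders rather than values from any real cost model.

```cpp
#include <cassert>

// Hypothetical per-instruction costs. Illustrative only; a real target
// would query these from its cost model.
struct ScalarCosts {
  unsigned Load = 1;    // scalar load of one element
  unsigned Insert = 1;  // insertelement into the result vector
  unsigned Compare = 1; // icmp on the lane's mask bit
  unsigned Branch = 1;  // conditional branch around the load
};

// The formula from the quoted comment:
//   NumElts * (cost(load element) + cost(insert element))
unsigned costWithoutControlFlow(unsigned NumElts, const ScalarCosts &C) {
  assert(NumElts > 0 && "expected a non-empty fixed-width vector");
  return NumElts * (C.Load + C.Insert);
}

// The same formula plus one compare and one branch per lane, which the
// scalarized sequence needs and the vector version does not.
unsigned costWithControlFlow(unsigned NumElts, const ScalarCosts &C) {
  return costWithoutControlFlow(NumElts, C) +
         NumElts * (C.Compare + C.Branch);
}
```

With these placeholder unit costs the per-lane control flow doubles the estimate, which is why leaving it out could make the scalarized form look much cheaper than it is.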


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D100745/new/

https://reviews.llvm.org/D100745


