[PATCH] D124616: [TTI][X86] Fix splat-load cost when load+broadcast cannot be combined.

Mon May 2 14:29:33 PDT 2022

vporpo added inline comments.

================
Comment at: llvm/lib/Target/X86/X86TargetTransformInfo.cpp:1616
+      bool LoadCanBeCombined =
+          L->hasOneUse() && isa<Instruction>(L->user_back());
+      if (ST->hasSSE3() && LoadCanBeCombined)
----------------
dmgreen wrote:
> The user of an instruction should always be an instruction.
> 
> Just to make sure - it isn't a problem for the SLP vectorizer is it? That the load might have multiple uses, which we are turning into a splat?
Oops, I will fix it.

Good point. I did not see any test failures, but I am fairly sure we won't get the right shuffle cost, because SLP queries the cost model before the vector code is generated. (That is, by the way, a good reason to test the actual cost values with something like https://reviews.llvm.org/D124802.)
Hmm, I'm not sure what the best approach would be. Perhaps we could walk the load's users and check that at most one of them is a vector instruction. That way SLP would still work, because the users are still scalar when it queries the cost, and TTI would still work for regular vector instructions. Any thoughts?
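
To make the idea concrete, here is a rough standalone sketch of the "at most one vector user" check. It is not the code in the patch: `MockUser`, `IsVectorInst`, and this version of `loadCanBeCombined` are hypothetical stand-ins, so that the logic can be compiled without LLVM. In the real cost model the loop would presumably run over `L->users()` instead, and how to decide that a user counts as a vector instruction is left open here.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for one user of the scalar load. In LLVM this
// would be a user reached through L->users(); here we keep only the one
// property the check looks at.
struct MockUser {
  bool IsVectorInst; // assumption: true if the user is a vector instruction
};

// Returns true when at most one user of the load is a vector instruction.
// Scalar users (what SLP sees before it emits vector code) are ignored,
// so a load with only scalar users still counts as combinable.
bool loadCanBeCombined(const std::vector<MockUser> &Users) {
  auto NumVectorUsers =
      std::count_if(Users.begin(), Users.end(),
                    [](const MockUser &U) { return U.IsVectorInst; });
  return NumVectorUsers <= 1;
}
```

With this sketch, a load with several scalar users and no vector user is still treated as combinable (the SLP case), while a load that already feeds two vector instructions is not.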

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124616/new/

https://reviews.llvm.org/D124616