[PATCH] D124616: [TTI][X86] Fix splat-load cost when load+broadcast cannot be combined.

Valeriy Dmitriev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon May 2 17:03:05 PDT 2022


vdmitrie added inline comments.


================
Comment at: llvm/lib/Target/X86/X86TargetTransformInfo.cpp:1621
+      bool LoadCanBeCombined =
+          L->getType()->isVectorTy() ? L->hasOneUse() : true;
+      if (ST->hasSSE3() && LoadCanBeCombined)
----------------
vdmitrie wrote:
> I'm not following.
> The minimal vector is 2x. The minimal broadcast is 2x.
> As far as I understand, if we are broadcasting <2 x double> {a,b} into <4 x double>,
> we are supposed to get {a,b,a,b}. Right?
> 
> We don't have an entry for v4f64,
> and the vmovddup function is {x0,x1,x2,x3} => {x0,x0,x2,x2}
> 

> As far as I understand, if we are broadcasting <2 x double> {a,b} into <4 x double>,
> we are supposed to get {a,b,a,b}. Right?
Correcting myself:
broadcasting <2 x double> {a,b} gives <2 x <2 x double>> => {{a,b},{a,b}}
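
To make the lane patterns above concrete, here is a minimal sketch in plain C++. It models only the element ordering, not the cost model or any LLVM API; the helper names (broadcastSubvector, movddup256, movddup128) are hypothetical and exist only for illustration.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-ins for <2 x double> and <4 x double>.
using V2 = std::array<double, 2>;
using V4 = std::array<double, 4>;

// Subvector broadcast of a <2 x double>: {a,b} -> {a,b,a,b}.
inline V4 broadcastSubvector(const V2 &V) {
  return V4{V[0], V[1], V[0], V[1]};
}

// 256-bit vmovddup lane pattern: {x0,x1,x2,x3} -> {x0,x0,x2,x2}.
inline V4 movddup256(const V4 &X) {
  return V4{X[0], X[0], X[2], X[2]};
}

// 128-bit movddup lane pattern: {x0,x1} -> {x0,x0}.
inline V2 movddup128(const V2 &X) {
  return V2{X[0], X[0]};
}

// Shows that the two 4-element patterns differ for distinct a and b.
inline bool patternsDiffer() {
  const V2 AB{1.0, 2.0};
  const V4 Wide = broadcastSubvector(AB); // {1,2,1,2}
  const V4 Dup = movddup256(Wide);        // {1,1,1,1}
  return Wide != Dup;
}
```

With distinct a and b, the subvector broadcast and the vmovddup pattern produce different results, which is the distinction the comment above is drawing.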



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124616/new/

https://reviews.llvm.org/D124616


