[PATCH] D124616: [TTI][X86] Fix splat-load cost when load+broadcast cannot be combined.
Valeriy Dmitriev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon May 2 17:03:05 PDT 2022
vdmitrie added inline comments.
================
Comment at: llvm/lib/Target/X86/X86TargetTransformInfo.cpp:1621
+ bool LoadCanBeCombined =
+ L->getType()->isVectorTy() ? L->hasOneUse() : true;
+ if (ST->hasSSE3() && LoadCanBeCombined)
----------------
vdmitrie wrote:
> I'm not following.
> The minimal vector is 2x. The minimal broadcast is 2x.
> As far as I understand, if we are broadcasting <2 x double> {a,b} into <4 x double>,
> we are supposed to get {a,b,a,b}. Right?
>
> We don't have an entry for v4f64,
> and vmovddup maps {x0,x1,x2,x3} => {x0,x0,x2,x2}.
Correcting myself: broadcasting <2 x double> {a,b} gives
<2 x <2 x double>>, i.e. {{a,b},{a,b}}.
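To make the two lane patterns discussed above concrete, here is a minimal standalone sketch. It is not LLVM code: the helper names broadcastSubvector and movddup256 are hypothetical and only model the element ordering, assuming the 256-bit vmovddup behaviour described in the quoted comment.

```cpp
#include <array>
#include <cassert>

// Hypothetical helper (not an LLVM API): models broadcasting a 2-element
// vector {a,b} as a whole subvector, giving {a,b,a,b}.
template <typename T>
std::array<T, 4> broadcastSubvector(const std::array<T, 2> &V) {
  return {V[0], V[1], V[0], V[1]};
}

// Hypothetical helper (not an LLVM API): models the lane pattern of a
// 256-bit vmovddup, which duplicates the even-indexed elements:
// {x0,x1,x2,x3} => {x0,x0,x2,x2}.
template <typename T>
std::array<T, 4> movddup256(const std::array<T, 4> &X) {
  return {X[0], X[0], X[2], X[2]};
}
```

The two results differ in every position except the first, which is why a vmovddup-style cost entry cannot stand in for a subvector broadcast.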
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D124616/new/
https://reviews.llvm.org/D124616