[PATCH] D111938: [TTI][X86] Add SSE2 sub-128bit vXi16/32 and v2i64 stride 2 interleaved load costs
Simon Pilgrim via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat Oct 16 08:11:45 PDT 2021
RKSimon added inline comments.
================
Comment at: llvm/lib/Target/X86/X86TargetTransformInfo.cpp:5224
- {2, MVT::v2i16, 2}, // (load 4i16 and) deinterleave into 2 x 2i16
- {2, MVT::v4i16, 2}, // (load 8i16 and) deinterleave into 2 x 4i16
{2, MVT::v8i16, 6}, // (load 16i16 and) deinterleave into 2 x 8i16
----------------
lebedev.ri wrote:
> Looking at `llvm-project/llvm/test/CodeGen/X86/vector-interleaved-load-i16-stride-2.ll`,
> VF4 codegen is really different between SSE2 and AVX2.
nice catch!
================
Comment at: llvm/lib/Target/X86/X86TargetTransformInfo.cpp:5230
- {2, MVT::v2i32, 2}, // (load 4i32 and) deinterleave into 2 x 2i32
{2, MVT::v4i32, 2}, // (load 8i32 and) deinterleave into 2 x 4i32
{2, MVT::v8i32, 4}, // (load 16i32 and) deinterleave into 2 x 8i32
----------------
lebedev.ri wrote:
> Looking at `llvm-project/llvm/test/CodeGen/X86/vector-interleaved-load-i32-stride-2.ll`,
> `@load_i32_stride2_vf4` also seems to match.
every little helps :)
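
For readers unfamiliar with these tables: each entry under discussion has the shape {interleave factor, per-member vector type, cost}. Below is a minimal, self-contained sketch of how a tiered lookup over such tables could work, assuming the AVX2 table is consulted first and the SSE2 table acts as a fallback. The names `VT`, `CostEntry`, `lookup` and `interleavedLoadCost` are hypothetical and are not LLVM's API. The AVX2 rows copy the context lines quoted above; the SSE2 rows are placeholders, not the values from the patch.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical stand-in for the machine value types named in the review.
enum class VT { v2i16, v4i16, v8i16, v2i32, v4i32, v8i32 };

// Hypothetical entry shape: {interleave factor, per-member type, cost}.
struct CostEntry {
  unsigned Factor;
  VT Type;
  unsigned Cost;
};

// Linear scan for an exact (Factor, Type) match; empty result if none.
std::optional<unsigned> lookup(const std::vector<CostEntry> &Tbl,
                               unsigned Factor, VT Type) {
  assert(Factor >= 2 && "an interleave group has at least two members");
  for (const CostEntry &E : Tbl)
    if (E.Factor == Factor && E.Type == Type)
      return E.Cost;
  return std::nullopt;
}

// Assumed tiering: try the AVX2 table when available, then fall back to SSE2.
std::optional<unsigned> interleavedLoadCost(bool HasAVX2, unsigned Factor,
                                            VT Type) {
  // Rows taken from the context lines quoted in the review.
  static const std::vector<CostEntry> AVX2Tbl = {
      {2, VT::v8i16, 6}, {2, VT::v4i32, 2}, {2, VT::v8i32, 4}};
  // Placeholder costs for the sub-128-bit types; NOT the patch's values.
  static const std::vector<CostEntry> SSE2Tbl = {
      {2, VT::v2i16, 2}, {2, VT::v4i16, 2}, {2, VT::v2i32, 2}};

  if (HasAVX2)
    if (std::optional<unsigned> C = lookup(AVX2Tbl, Factor, Type))
      return C;
  return lookup(SSE2Tbl, Factor, Type);
}
```

Under this assumed scheme, an entry removed from the AVX2 table is still answered on AVX2 targets by the SSE2 fallback, so sharing an entry is only appropriate when both subtargets generate comparable code for that type.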
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D111938/new/
https://reviews.llvm.org/D111938