[PATCH] D111938: [TTI][X86] Add SSE2 sub-128bit vXi16/32 and v2i64 stride 2 interleaved load costs
Roman Lebedev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat Oct 16 07:24:58 PDT 2021
lebedev.ri added inline comments.
================
Comment at: llvm/lib/Target/X86/X86TargetTransformInfo.cpp:5224
+ {2, MVT::v2i16, 2}, // (load 4i16 and) deinterleave into 2 x 2i16
+ {2, MVT::v4i16, 2}, // (load 8i16 and) deinterleave into 2 x 4i16
{2, MVT::v8i16, 6}, // (load 16i16 and) deinterleave into 2 x 8i16
----------------
Looking at `llvm-project/llvm/test/CodeGen/X86/vector-interleaved-load-i16-stride-2.ll`,
VF4 codegen is really different between SSE2 and AVX2.
================
Comment at: llvm/lib/Target/X86/X86TargetTransformInfo.cpp:5230
+ {2, MVT::v2i32, 2}, // (load 4i32 and) deinterleave into 2 x 2i32
{2, MVT::v4i32, 2}, // (load 8i32 and) deinterleave into 2 x 4i32
{2, MVT::v8i32, 4}, // (load 16i32 and) deinterleave into 2 x 8i32
----------------
Looking at `llvm-project/llvm/test/CodeGen/X86/vector-interleaved-load-i32-stride-2.ll`,
`@load_i32_stride2_vf4` also seems to match.
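For readers unfamiliar with these tables, here is a minimal, self-contained sketch of how such entries might be consulted. It assumes each triple reads as {interleave factor, type of each deinterleaved result, shuffle cost}. `VecTy`, `InterleavedCostEntry`, `Stride2LoadTbl`, and `lookupDeinterleaveCost` are hypothetical stand-ins for illustration only, not LLVM's actual types or helpers.

```cpp
// Hypothetical stand-in for the MVT::* vector types quoted above.
enum class VecTy { v2i16, v4i16, v8i16, v2i32, v4i32, v8i32 };

// Hypothetical entry layout, assumed to mirror the quoted triples.
struct InterleavedCostEntry {
  unsigned Factor; // interleave stride (2 for every entry here)
  VecTy Ty;        // type of each deinterleaved result vector
  int Cost;        // assumed cost of the deinterleaving shuffles
};

// Values copied from the table lines quoted in this review.
constexpr InterleavedCostEntry Stride2LoadTbl[] = {
    {2, VecTy::v2i16, 2}, // (load 4i16 and) deinterleave into 2 x 2i16
    {2, VecTy::v4i16, 2}, // (load 8i16 and) deinterleave into 2 x 4i16
    {2, VecTy::v8i16, 6}, // (load 16i16 and) deinterleave into 2 x 8i16
    {2, VecTy::v2i32, 2}, // (load 4i32 and) deinterleave into 2 x 2i32
    {2, VecTy::v4i32, 2}, // (load 8i32 and) deinterleave into 2 x 4i32
    {2, VecTy::v8i32, 4}, // (load 16i32 and) deinterleave into 2 x 8i32
};

// Linear search by (Factor, Ty); returns -1 when no entry matches.
constexpr int lookupDeinterleaveCost(unsigned Factor, VecTy Ty) {
  for (unsigned I = 0; I != sizeof(Stride2LoadTbl) / sizeof(Stride2LoadTbl[0]); ++I)
    if (Stride2LoadTbl[I].Factor == Factor && Stride2LoadTbl[I].Ty == Ty)
      return Stride2LoadTbl[I].Cost;
  return -1;
}
```

In this sketch a caller would treat -1 as "no table entry" and fall back to some generic estimate. How the real cost model handles a miss is not shown in the quoted context.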
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D111938/new/
https://reviews.llvm.org/D111938