[PATCH] D100684: [X86][CostModel] X86TTIImpl::getMemoryOpCost(): rewrite vector handling again
Roman Lebedev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed May 12 05:13:40 PDT 2021
lebedev.ri added a comment.
@RKSimon I'm in need of a bit of guidance.
I'd like to maybe deal with `getInterleavedMemoryOpCostAVX2()` next, but I'm not sure what the best way forward is.
After thinking about it, I'm iffy about just adding more hardcoded entries to the cost table there.
We have element {i8, i16, i32, i64} * stride {2..6} * VF {8..64}. That's 64 entries already, by naive estimates.
This ignores partial strided loads (with `Indices.size()` != stride), and other vector widths.
Adding those multiplies the entry count again, so the table blows up combinatorially.
Do we really want to proceed on that path?
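For a rough sense of scale, here is a minimal sketch that just multiplies out the three axes named above. `countTableEntries` is a hypothetical helper, not anything in `X86TTIImpl`, and it assumes VF takes only power-of-two values. Read literally, stride {2..6} is five values, which gives 4 * 5 * 4 = 80; the figure of 64 corresponds to counting four strides.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an LLVM API): number of cost-table entries needed
// to cover every (element type, stride, VF) combination.
// Assumption: VF takes only power-of-two values in [MinVF, MaxVF].
std::size_t countTableEntries(const std::vector<unsigned> &EltBits,
                              unsigned MinStride, unsigned MaxStride,
                              unsigned MinVF, unsigned MaxVF) {
  assert(MinStride <= MaxStride && MinVF != 0 && MinVF <= MaxVF);
  const std::size_t NumStrides = MaxStride - MinStride + 1;
  std::size_t NumVFs = 0;
  for (unsigned VF = MinVF; VF <= MaxVF; VF *= 2)
    ++NumVFs;
  return EltBits.size() * NumStrides * NumVFs;
}
```

Calling `countTableEntries({8, 16, 32, 64}, 2, 6, 8, 64)` covers the full-group case only; partial groups and other vector widths would each add another factor.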
I'm seeing two alternatives:
1. Perhaps we should try to come up with an algorithmic approach, like we have here?
2. Perhaps we should simply automate this? Run the strided load pattern through codegen, run that through exegesis, and automatically record its performance?
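To make option 2 a bit more concrete, here is a minimal sketch of the first step only: emitting the IR for one strided (de-interleaving) load so that it could then be fed through codegen. `emitStridedLoadIR` is a hypothetical helper, not an existing LLVM API. It assumes the pattern is a wide load followed by a `shufflevector` that picks every Stride-th element, uses opaque-pointer (`ptr`) syntax, and assumes element alignment. Compiling the output and measuring it with exegesis is not shown.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper (not an LLVM API): emit textual IR for one member of an
// interleaved load group. It loads VF * Stride contiguous elements and keeps
// the ones at positions Index, Index + Stride, Index + 2 * Stride, ...
std::string emitStridedLoadIR(unsigned EltBits, unsigned Stride, unsigned VF,
                              unsigned Index) {
  assert(EltBits >= 8 && Stride >= 1 && VF >= 1 && Index < Stride);
  const std::string Elt = "i" + std::to_string(EltBits);
  const std::string WideTy =
      "<" + std::to_string(VF * Stride) + " x " + Elt + ">";
  const std::string NarrowTy = "<" + std::to_string(VF) + " x " + Elt + ">";

  std::ostringstream OS;
  OS << "define " << NarrowTy << " @strided_load(ptr %p) {\n";
  // Assumption: the wide load is only element-aligned.
  OS << "  %wide = load " << WideTy << ", ptr %p, align " << EltBits / 8
     << "\n";
  OS << "  %v = shufflevector " << WideTy << " %wide, " << WideTy
     << " poison, <" << VF << " x i32> <";
  for (unsigned I = 0; I != VF; ++I) {
    if (I != 0)
      OS << ", ";
    OS << "i32 " << Index + I * Stride;
  }
  OS << ">\n";
  OS << "  ret " << NarrowTy << " %v\n}\n";
  return OS.str();
}
```

Looping this over element width, stride, VF and index would produce one small function per table entry, which is the part that a hand-maintained table currently has to enumerate manually.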
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D100684/new/
https://reviews.llvm.org/D100684