[PATCH] D22456: [X86][SSE] Add cost model values for CTPOP of vectors
Sean Silva via llvm-commits
llvm-commits at lists.llvm.org
Mon Jul 18 14:46:59 PDT 2016
silvas added a comment.
Is the plan to make these costs depend on the host CPU as well? For example, IIRC the vector ctpop lowerings contain serially dependent pshufb instructions, which have 1-cycle latency on big Intel cores but 4-cycle latency on Jaguar, according to Agner Fog's instruction tables.
Also, on Jaguar scalar popcnt is "as cheap as an add", whereas on e.g. Skylake scalar popcnt has a quarter of the throughput of an add and three times the latency.
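To make the concern concrete, here is a minimal sketch of how a serial pshufb chain scales with per-CPU latency. It is not LLVM code: the `Cpu` enum and both function names are hypothetical, the chain length is a free parameter rather than the actual count in any lowering, and the only numbers used are the pshufb latencies quoted above.

```cpp
// Hypothetical CPU classes, named only for this illustration.
enum class Cpu { BigIntelCore, Jaguar };

// Latency in cycles of one pshufb, using the figures quoted in the comment:
// 1 cycle on big Intel cores, 4 cycles on Jaguar.
constexpr unsigned pshufbLatency(Cpu cpu) {
  return cpu == Cpu::Jaguar ? 4u : 1u;
}

// Critical-path latency contributed by `chainLength` pshufb instructions
// where each one consumes the result of the previous one, so the latencies
// add up instead of overlapping.
constexpr unsigned serialPshufbLatency(Cpu cpu, unsigned chainLength) {
  return pshufbLatency(cpu) * chainLength;
}
```

For any chain length, the Jaguar figure comes out four times the big-core figure, which is why a single fixed cost per vector type cannot be accurate for both kinds of CPU.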
Repository:
rL LLVM
https://reviews.llvm.org/D22456