[PATCH] D25350: [X86] Enable interleaved memory accesses by default
Michael Kuperstein via llvm-commits
llvm-commits at lists.llvm.org
Mon Oct 10 08:24:53 PDT 2016
mkuper added a comment.
In https://reviews.llvm.org/D25350#566175, @RKSimon wrote:
> Any performance test gains/regressions?
We've had performance gains on internal benchmarks. (And, anecdotally, a performance regression, which turned out to be a gain - we've started vectorizing a loop we were not vectorizing before, which really should have better performance when vectorized - but the dynamic trip count of that loop tends to be 1 or 2...)
As far as I know Intel also had performance gains on the benchmarks they run, but I'll let Ayal/Elena/Dorit speak for themselves.
As to public benchmarks - I don't have an up-to-date SPEC run with this unfortunately. I can run one if you want - unless Intel already happen to have the results handy?
Anyway, if you have internal benchmarks you want to run this on pre-commit, please do - codegen isn't necessarily happy with the shuffle sequences this generates. We had some really bad regressions initially, which led to r283480. I don't expect any more big surprises, since the x86 cost model is still *really* conservative w.r.t. interleaved memory accesses (teaching it to be more precise is a separate issue, and probably ties in with Farhana's work in https://reviews.llvm.org/D24681), but who knows.
https://reviews.llvm.org/D25350