[PATCH] D39976: [AArch64] Consider the cost model when folding loads and stores
Geoff Berry via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Feb 5 11:43:58 PST 2018
gberry added a comment.
I've thought about this some more and tested it out on Falkor. As currently written, this change prevents pre/post increments from being folded into SIMD store instructions, causing minor performance regressions. I have the following general reservations as well:
- Does using the max latency of the load/store and add make sense given that the operations are dependent?
- Does always favoring latency over number of uops (an approximation of throughput) make sense? Unless the operation is on the critical path, I would think not.
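To make the first question concrete, here is a minimal sketch of the two ways of combining latencies. The `InstrCost` struct and both helper functions are hypothetical and are not part of the patch or of LLVM. The sketch assumes that operations which can overlap cost roughly the larger of their latencies, while a chain in which one operation consumes the other's result costs roughly their sum.

```cpp
#include <algorithm>

// Hypothetical per-instruction cost record (illustration only, not an LLVM type).
struct InstrCost {
  unsigned Latency; // cycles until the result is available
  unsigned NumUops; // micro-ops issued, a rough proxy for throughput cost
};

// Combined latency when the two instructions are independent and can overlap:
// the slower one determines when both results are ready.
unsigned independentLatency(InstrCost A, InstrCost B) {
  return std::max(A.Latency, B.Latency);
}

// Combined latency when B consumes A's result: B cannot start until A
// finishes, so the latencies add.
unsigned dependentLatency(InstrCost A, InstrCost B) {
  return A.Latency + B.Latency;
}

// Total micro-ops for the unfolded pair, to compare against the uop count of
// a single folded instruction when throughput rather than latency matters.
unsigned totalUops(InstrCost A, InstrCost B) {
  return A.NumUops + B.NumUops;
}
```

With illustrative values such as a 4-cycle load and a 1-cycle add, the max-based estimate is 4 cycles and the dependent-chain estimate is 5, so the two models can disagree about whether folding is profitable.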
Given this, along with the assumptions about multiple-uop instructions (which also do not hold for Falkor), I would suggest that a better approach might be to add a target-specific property that would allow you to avoid the specific opcodes that are a problem for your target.
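One possible shape for such a property is sketched below. Everything in it is an assumption made for illustration: `SubtargetSketch`, `isSlowUpdateForm`, `shouldFoldUpdate`, and the opcode names are hypothetical placeholders, not LLVM classes, hooks, or real AArch64 opcodes.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for a subtarget description (not an LLVM class).
// Each target lists the pre/post-increment forms it wants the optimizer
// to avoid creating.
struct SubtargetSketch {
  std::set<std::string> SlowUpdateForms;

  // Returns true if this target has flagged the given opcode as a problem.
  bool isSlowUpdateForm(const std::string &Opcode) const {
    return SlowUpdateForms.count(Opcode) != 0;
  }
};

// Folding decision driven only by the target-specific list: fold unless the
// resulting opcode is one the target asked to avoid.
bool shouldFoldUpdate(const SubtargetSketch &ST, const std::string &Opcode) {
  return !ST.isSlowUpdateForm(Opcode);
}
```

Under this scheme a target with an empty list keeps the existing folding behavior, and only targets that opt in see any change.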
Repository:
rL LLVM
https://reviews.llvm.org/D39976