[PATCH] D39976: [AArch64] Consider the cost model when folding loads and stores

Mon Feb 5 11:43:58 PST 2018

gberry added a comment.

I've thought about this some more and tested it out on Falkor.  As currently written, this change prevents pre/post increments from being folded into SIMD store instructions, causing minor performance regressions.  I have the following general reservations as well:

- does using the max latency of the load/store and add make sense given that the operations are dependent?
- does always favoring latency over number of uops (an approximation of throughput) make sense?  Unless the operation is on the critical path, I would think not.
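
To make the two reservations above concrete, here is a minimal sketch. It assumes a made-up `OpCost` record and made-up helper names; none of this is the patch's code or LLVM's scheduling-model API. It shows how taking the max of two latencies differs from summing them when one operation feeds the other, and one possible policy that only compares latency when the instruction is on the critical path and compares uop counts otherwise.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical per-instruction cost. The fields and any numbers plugged in
// are illustrative only, not taken from a real scheduling model.
struct OpCost {
  unsigned Latency; // cycles until the result is available
  unsigned Uops;    // micro-ops issued, a rough proxy for throughput
};

// Latency of two operations that can execute independently of each other.
constexpr unsigned independentLatency(OpCost A, OpCost B) {
  return std::max(A.Latency, B.Latency);
}

// Latency of two operations where the second consumes the result of the first.
constexpr unsigned dependentLatency(OpCost A, OpCost B) {
  return A.Latency + B.Latency;
}

// Hypothetical policy: on the critical path compare latency (treating the
// unfolded pair as a dependent chain); off the critical path compare uops.
constexpr bool preferFolded(OpCost Folded, OpCost MemOp, OpCost Add,
                            bool OnCriticalPath) {
  if (OnCriticalPath)
    return Folded.Latency <= dependentLatency(MemOp, Add);
  return Folded.Uops <= MemOp.Uops + Add.Uops;
}
```

With a 4-cycle memory op and a 1-cycle add, the max-based estimate is 4 cycles while the chained estimate is 5, so a folded form with 5 cycles of latency looks worse under the first model and break-even under the second.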

Given this, combined with the assumptions about multi-uop instructions (which also do not hold for Falkor), I would suggest that a better approach might be to add a target-specific property that would allow you to avoid the specific opcodes that are a problem for your target.
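
As a rough sketch of what such a property could look like (the `Opcode` enum, `TargetFoldingInfo`, and the factory functions are all hypothetical names invented for illustration, not existing LLVM interfaces), each target would list the folded opcodes it wants to avoid, and the folding code would simply consult that list:

```cpp
#include <cassert>
#include <set>

// Hypothetical opcode identifiers, invented for this sketch. Real AArch64
// opcodes come from LLVM's generated instruction tables.
enum class Opcode {
  LoadPostIdx,
  StorePostIdx,
  SimdStorePostIdx,
  SimdLoadPairPreIdx
};

// Hypothetical per-target property: the set of folded opcodes that the
// load/store folding should not produce on this target.
struct TargetFoldingInfo {
  std::set<Opcode> AvoidFolded;

  // Folding is allowed unless the target has listed the resulting opcode.
  bool mayFoldInto(Opcode Opc) const { return AvoidFolded.count(Opc) == 0; }
};

// A target with no problematic opcodes: every fold stays enabled.
inline TargetFoldingInfo makeUnrestrictedTarget() { return {}; }

// A target that opts out of one specific folded opcode only.
inline TargetFoldingInfo makeRestrictedTarget() {
  return TargetFoldingInfo{{Opcode::SimdLoadPairPreIdx}};
}
```

The point of this shape is that targets without the problem keep all existing folds, while a target with slow forms disables only those opcodes.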

Repository:
  rL LLVM

https://reviews.llvm.org/D39976