[PATCH] D122776: [LoopCacheAnalysis] Improve loop cache analysis results by taking memory access strides into consideration

Fri Apr 8 09:34:58 PDT 2022

bmahjour added a subscriber: etiotto.
bmahjour added a comment.

Thanks for looking into this, @congzhe. As you described, we don't seem to be able to distinguish between the costs of moving different outer loops in nests more than 2 levels deep. The data locality paper <https://www.cs.utexas.edu/users/mckinley/papers/asplos-1994.pdf> estimates the cost from the number of cache lines used when a loop is moved to the innermost level. Since the stride information (when it's less than the cache line size) is already used in the RefCost functions, it makes more sense to also integrate it into the cost formula when it's larger than the cache line size. Treating stride as a separate component of the cost and associating it with the loop (as opposed to each reference group) makes the result of the analysis more complicated to consume and can cause ambiguity when dealing with multiple reference groups.

I think the proper way to solve this is to fold the "stride" information into the cost calculation. In computing the `LoopCost(l)` function, the paper already estimates the cache lines used based on the product of loop trip counts. I'd like to propose https://reviews.llvm.org/D123400 as a more desirable alternative, as it uses the same concept to take the depth of the subscript dimensions into account when calculating each RefCost. Comments are welcome. FYI @etiotto.
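
To make the tie described above concrete, here is a minimal sketch of the paper's baseline model as I read it. It is not the LLVM `CacheCost` implementation and not the change in D123400. All names (`AccessKind`, `RefGroup`, `refCost`, `loopCost`, `exampleCosts`) are hypothetical, and rounding the consecutive case up to whole cache lines is my own assumption. Each reference group is classified per candidate innermost loop as invariant, consecutive (stride smaller than the cache line), or non-consecutive, and `loopCost` multiplies the RefCost by the trip counts of the remaining loops.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical classification of a reference with respect to the loop
// that is being considered for the innermost position.
enum class AccessKind { Invariant, Consecutive, NonConsecutive };

// One reference group. Entry l of each vector describes the access when
// loop l is the candidate innermost loop (hypothetical layout).
struct RefGroup {
  std::vector<AccessKind> kind;
  std::vector<uint64_t> strideBytes;
};

// Estimated cache lines touched by one reference over `trip` iterations.
// Assumption: the consecutive case is rounded up to whole cache lines.
uint64_t refCost(AccessKind kind, uint64_t strideBytes, uint64_t trip,
                 uint64_t cacheLineBytes) {
  switch (kind) {
  case AccessKind::Invariant:
    return 1;
  case AccessKind::Consecutive:
    return (trip * strideBytes + cacheLineBytes - 1) / cacheLineBytes;
  case AccessKind::NonConsecutive:
    return trip; // one new cache line per iteration, whatever the stride
  }
  return trip; // not reached; keeps compilers quiet
}

// Cost of placing loop `l` innermost: sum over reference groups of
// RefCost times the product of the other loops' trip counts.
uint64_t loopCost(std::size_t l, const std::vector<uint64_t> &trips,
                  const std::vector<RefGroup> &groups,
                  uint64_t cacheLineBytes) {
  uint64_t others = 1;
  for (std::size_t i = 0; i < trips.size(); ++i)
    if (i != l)
      others *= trips[i];
  uint64_t total = 0;
  for (const RefGroup &g : groups)
    total += refCost(g.kind[l], g.strideBytes[l], trips[l], cacheLineBytes) *
             others;
  return total;
}

// Example: double A[64][64][64] accessed as A[i][j][k] in an i, j, k nest
// with 64 iterations per loop and an assumed 64-byte cache line.
std::vector<uint64_t> exampleCosts() {
  const std::vector<uint64_t> trips = {64, 64, 64};
  const RefGroup a{{AccessKind::NonConsecutive, AccessKind::NonConsecutive,
                    AccessKind::Consecutive},
                   {64 * 64 * 8, 64 * 8, 8}};
  std::vector<uint64_t> costs;
  for (std::size_t l = 0; l < trips.size(); ++l)
    costs.push_back(loopCost(l, trips, {a}, 64));
  return costs;
}
```

In this example the costs for `i` and `j` come out identical, even though their strides differ by a factor of 64, because any stride at or above the cache line size collapses into the same non-consecutive case. Only `k` is distinguished, which matches the observation that the model cannot order the outer loops of a deeper nest.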

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122776/new/

https://reviews.llvm.org/D122776