[PATCH] D24833: [LoopDataPrefetch/AArch64] Allow selective prefetching of symbolic strided accesses

Adam Nemet via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 28 22:39:47 PDT 2016

anemet added inline comments.

Comment at: lib/Target/AArch64/AArch64Subtarget.cpp:74-75
@@ -73,3 +73,4 @@
     PrefetchDistance = 740;
+    PrefetchDegree = 1;
     MinPrefetchStride = 1024;
     MaxPrefetchIterationsAhead = 11;
bmakam wrote:
> anemet wrote:
> > So you are saying on one hand (MinPrefetchStride = 1024) that we shouldn't bother prefetching unless the stride is at least 1K but then you say (PrefetchDegree = 1) that you want to prefetch the very next cache line anytime the stride is not known at compile time.
> > 
> > I feel that there is a contradiction here.  The former suggests that you have a pretty powerful HW prefetcher, the latter that you don't and are willing to speculate aggressively to compensate for it.
> > 
> > It seems that something is wrong with the model here.
> Thanks for the review, Adam.
> You are right. If the stride is unknown at compile time, we speculate and prefetch the next cache line; if the stride is known, we do not need to speculate, so we conservatively prefetch only for strides > 1K.
> Sorry, I do not see a contradiction here. Are you suggesting inserting runtime checks to determine that the unknown stride is > 1K and only then prefetching? I just do not see a justification for the additional complexity, and the runtime checks might well hurt performance.
No, I am not suggesting run-time checks.

Accesses with unknown strides are *only* omitted if MinPrefetchStride is set (since we can't compare them at compile time).  It seems to me that you don't want to set MinPrefetchStride for your target.  Have you experimented with that?

To explain the contradiction a bit more, your model says that you *have* a HW prefetcher that is able to track regular strides < 1024.

Your patch contradicts this by saying that even small regular strides are worth prefetching -- the next cache line corresponds to a stride that is way less than 1024.
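
To make the contradiction concrete, here is a minimal sketch of the decision as I understand it from this thread. It is not the actual LoopDataPrefetch code: the PrefetchModel struct, the helper names, the treatment of "at least 1K" as >=, and the 64-byte cache line in the usage note are all assumptions made for illustration.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical, simplified model of the target settings discussed above.
struct PrefetchModel {
  unsigned MinPrefetchStride; // known strides below this are left to the HW prefetcher
  unsigned PrefetchDegree;    // cache lines ahead to prefetch when the stride is unknown (0 = off)
  unsigned CacheLineSize;     // bytes; assumed value, not taken from the patch
};

// Stride is std::nullopt when it is not known at compile time.
bool shouldPrefetch(const PrefetchModel &M, std::optional<int64_t> Stride) {
  if (M.MinPrefetchStride <= 1)
    return true;                 // no minimum set: every access qualifies
  if (!Stride)
    return M.PrefetchDegree > 0; // behaviour proposed by the patch (assumed)
  int64_t Abs = *Stride < 0 ? -*Stride : *Stride;
  return Abs >= static_cast<int64_t>(M.MinPrefetchStride);
}

// Distance in bytes that an unknown-stride prefetch effectively covers.
unsigned impliedStride(const PrefetchModel &M) {
  return M.PrefetchDegree * M.CacheLineSize;
}

// True when unknown strides are prefetched at a distance that the model
// would reject for a known stride of the same size.
bool isInconsistent(const PrefetchModel &M) {
  return M.MinPrefetchStride > 1 && M.PrefetchDegree > 0 &&
         impliedStride(M) < M.MinPrefetchStride;
}
```

With MinPrefetchStride = 1024, PrefetchDegree = 1 and an assumed 64-byte line, a known stride of 64 is rejected, while an unknown stride is prefetched one line (64 bytes) ahead, so isInconsistent returns true. Dropping MinPrefetchStride to 1 removes the inconsistency in this sketch.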

