[llvm-dev] Determination of statements that contain only matrix multiplication
Roman Gareev via llvm-dev
llvm-dev at lists.llvm.org
Sun May 29 04:34:51 PDT 2016
2016-05-28 19:48 GMT+05:00 4lbert C0hen <4lbert.h.c0hen at gmail.com>:
> Sorry for not responding earlier.
> On 05/20/2016 03:05 PM, Roman Gareev wrote:
>> Thank you very much for the advice! I could probably try to avoid
>> using non-hardware prefetching in the project, if Tobias doesn’t
>> disagree with it. My understanding is that prefetching isn’t used
>> explicitly in  and, according to , in some cases 90% of the
>> turbo boost peak of the processor can be attained without it.
> Too many negations :-) I'm not sure I followed exactly what you wanted to
> say, but I understand that this is not the priority since you can get 90% of
> the performance without worrying about prefetching.
Sorry for the misunderstanding. Yes, I think that, if nobody minds,
prefetching need not be a priority of this project, because on some
platforms we can get 90% of the performance without worrying about it.
Furthermore, as you mentioned before, hardware prefetchers can be good
at strided accesses in single-threaded code.
>> I started to consider prefetching, because it’s used in
>> implementations of gemm micro-kernels of BLIS framework . If I’m
>> not mistaken, it’s applied to try to make sure that micro-panel Br is
>> loaded after micro-panel Ar (as required in  p. 11). For example,
>> using it helps to reduce the execution time of the attached
> Interesting. The BLIS implementation prefetches only the first cache line,
> before traversing a given interval of memory. This clearly confirms the
> implementation relies on hardware prefetching to prefetch the subsequent
> lines. This makes a lot of sense.
Thank you for the explanation!
> Yet surprisingly, the BLIS implementation
> does not attempt to anticipate the fetch. It schedules the prefetch
> instruction right before the first load of a given interval.
Yes, I think that it’s interesting.
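
To make the pattern described above concrete, here is a minimal C sketch.
It is not taken from BLIS: the function name dot_interval and the
parameters a_panel and b_panel are hypothetical, and it assumes a GCC or
Clang compiler that provides __builtin_prefetch. It issues one software
prefetch for the first cache line of each interval, right before the
first load, and leaves the subsequent lines to the hardware prefetcher.

```c
#include <stddef.h>

/* Hypothetical illustration, not BLIS code: dot product of two
 * contiguous intervals of n doubles each. */
static double dot_interval(const double *a_panel, const double *b_panel,
                           size_t n)
{
    double acc = 0.0;

    /* One prefetch per interval, for its first cache line only.
     * Arguments: address, rw = 0 (read), locality = 3 (high temporal
     * locality). The prefetch is only a hint and does not fault. */
    __builtin_prefetch(a_panel, 0, 3);
    __builtin_prefetch(b_panel, 0, 3);

    /* Sequential traversal: the remaining cache lines are assumed to be
     * brought in by the hardware prefetcher, not by software. */
    for (size_t i = 0; i < n; ++i)
        acc += a_panel[i] * b_panel[i];

    return acc;
}
```

Anticipating the fetch, as mentioned above, would instead mean issuing
the prefetch some iterations before the interval is first read; the
sketch deliberately does not do that.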
>>  - http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf
>>  - http://wiki.cs.utexas.edu/rvdg/HowToOptimizeGemm
>>  -
Cheers, Roman Gareev.