[llvm-dev] [RFC] Non-Temporal hints from Loop Vectorizer
Adam Nemet via llvm-dev
llvm-dev at lists.llvm.org
Tue May 3 10:21:07 PDT 2016
> On May 3, 2016, at 3:40 AM, Hahnfeld, Jonas via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Hello all,
>
> I've been wondering why Clang doesn't generate non-temporal stores when
> compiling the STREAM benchmark [1] and therefore doesn't yield optimal
> results.
>
> It turned out that the Loop Vectorizer correctly vectorizes the arithmetic
> operations and also merges the loads and stores into vector operations.
> However, it doesn't add the '!nontemporal' metadata, which would be needed for
> maximal bandwidth on X86.
> I briefly looked into this and for non-temporal memory instructions to work,
> the memory address would have to be aligned to the vector length which
> currently isn't the case either.
>
> To summarize, the following things would be needed to give non-temporal
> hints:
> 1) Ensure correct alignment of merged vector memory instructions
> This could be implemented by executing the first (scalar) loop iterations
> until the addresses for loads and stores are aligned, similar to what already
> happens for the remainder of the loop. The larger alignment would also allow
> aligned vector instructions instead of the currently unaligned ones.
>
> 2) Give non-temporal hints when different array elements are only used once
> per loop iteration
> We probably need to analyze the different loads and stores per loop iteration
> for this…
You probably also want to ensure that you stay in the loop long enough, i.e. have some sort of dynamic trip-count check or PGO data indicating this.
You essentially want to ensure that reads after the loop would not have hit in the cache even with regular stores. (If you are writing a large area in the loop, a large percentage of those lines will already have been evicted by the time you exit the loop.)
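To make the combination concrete, here is a rough hand-written sketch in C of what points 1) and 2) plus such a trip-count check might look like for a STREAM-style scale kernel. It is not vectorizer output and not a proposed implementation. The names peel_count, scale and nt_min are hypothetical, the threshold value is left to the caller because a good value is unknown here, and the code assumes the arrays are naturally aligned for double. On x86 with SSE2 it uses _mm_stream_pd, which requires a 16-byte aligned address; elsewhere it falls back to the plain scalar loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h> /* _mm_stream_pd, _mm_loadu_pd, _mm_mul_pd, _mm_sfence */
#endif

/* Number of leading scalar iterations needed before p is 16-byte aligned.
 * Assumption: p is naturally (8-byte) aligned for double. */
static size_t peel_count(const double *p, size_t n) {
    size_t mis = (size_t)((uintptr_t)p % 16);
    assert(mis % sizeof(double) == 0);
    size_t peel = mis ? (16 - mis) / sizeof(double) : 0;
    return peel < n ? peel : n;
}

/* a[i] = s * b[i]. Non-temporal stores are used only when the trip count
 * n reaches nt_min (a hypothetical, caller-chosen threshold). */
static void scale(double *a, const double *b, double s, size_t n,
                  size_t nt_min) {
    size_t i = 0;
#if defined(__SSE2__)
    if (n >= nt_min) {
        /* 1) Peel scalar iterations until the store address is aligned. */
        size_t peel = peel_count(a, n);
        for (; i < peel; ++i)
            a[i] = s * b[i];
        /* 2) Main loop: unaligned loads, aligned non-temporal stores. */
        __m128d vs = _mm_set1_pd(s);
        for (; i + 2 <= n; i += 2) {
            __m128d vb = _mm_loadu_pd(b + i);
            _mm_stream_pd(a + i, _mm_mul_pd(vs, vb));
        }
        /* Order the streaming stores before any later stores. */
        _mm_sfence();
    }
#else
    (void)nt_min; /* no streaming stores available in this build */
#endif
    /* Remainder loop; also the whole loop for short trip counts. */
    for (; i < n; ++i)
        a[i] = s * b[i];
}
```

Only the store address is aligned by the peel loop here; the load from b stays unaligned, since both arrays cannot in general be aligned by the same number of peeled iterations.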
Adam
>
> Any thoughts or any ongoing work that I'm missing?
>
> Thanks,
> Jonas
>
>
> [1] https://www.cs.virginia.edu/stream/
>
> --
> Jonas Hahnfeld, MATSE-Auszubildender
>
> IT Center
> Group: High Performance Computing
> Division: Computational Science and Engineering
> RWTH Aachen University
> Seffenter Weg 23
> D 52074 Aachen (Germany)
> Hahnfeld at itc.rwth-aachen.de
> www.itc.rwth-aachen.de
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev