<div dir="ltr">I believe Skylake client's L2 is only 4-way associative, while Skylake server has a much larger L2 with higher associativity.<div><br></div><div>Is this something we should get from CPUID when the user passes -mcpu=native?</div></div><div class="gmail_extra"><br clear="all"><div><div class="gmail_signature" data-smartmail="gmail_signature">~Craig</div></div>
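<div>For reference, here is a minimal sketch of how the cache parameters could be decoded from CPUID, assuming Intel's documented register layout for leaf 4 ("deterministic cache parameters"). The names <code>CacheParams</code> and <code>decodeCpuidLeaf4</code> are hypothetical and not part of LLVM, and the sketch only decodes raw register values; how they are read on the host is left out.</div>

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type, for illustration only (not an LLVM type).
struct CacheParams {
  bool Valid = false;            // false when the subleaf describes no cache
  unsigned Level = 0;            // 1 = L1, 2 = L2, ...
  unsigned Ways = 0;             // associativity
  unsigned LineSize = 0;         // bytes per cache line
  std::uint64_t SizeInBytes = 0; // total capacity
};

// Decodes one subleaf of CPUID leaf 4 from its raw EAX/EBX/ECX values.
// Assumed layout (Intel): EAX[4:0] cache type (0 = no more caches),
// EAX[7:5] level, EBX[11:0] line size - 1, EBX[21:12] partitions - 1,
// EBX[31:22] ways - 1, ECX = number of sets - 1.
inline CacheParams decodeCpuidLeaf4(std::uint32_t EAX, std::uint32_t EBX,
                                    std::uint32_t ECX) {
  CacheParams P;
  const unsigned Type = EAX & 0x1F;
  if (Type == 0)
    return P; // end of the subleaf list

  P.Level = (EAX >> 5) & 0x7;
  P.LineSize = (EBX & 0xFFF) + 1;
  const unsigned Partitions = ((EBX >> 12) & 0x3FF) + 1;
  P.Ways = ((EBX >> 22) & 0x3FF) + 1;
  const std::uint64_t Sets = std::uint64_t(ECX) + 1;

  // Every field encodes "value minus one", so each factor is at least 1.
  assert(P.Ways >= 1 && P.LineSize >= 1);
  P.SizeInBytes = std::uint64_t(P.Ways) * Partitions * P.LineSize * Sets;
  P.Valid = true;
  return P;
}
```

<div>Under this layout, a 256 KiB, 4-way L2 with 64-byte lines would be reported as EAX=0x43, EBX=0x00C0003F, ECX=1023, which decodes to 4 ways and 262144 bytes.</div>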
<br><div class="gmail_quote">On Thu, Aug 24, 2017 at 2:46 AM, Tobias Grosser via llvm-commits <span dir="ltr"><<a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Author: grosser<br>
Date: Thu Aug 24 02:46:25 2017<br>
New Revision: 311647<br>
<br>
URL: <a href="http://llvm.org/viewvc/llvm-project?rev=311647&view=rev" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project?rev=311647&view=rev</a><br>
Log:<br>
Model cache size and associativity in TargetTransformInfo<br>
<br>
Summary:<br>
We add the precise cache sizes and associativity for the following Intel<br>
architectures:<br>
<br>
- Penryn<br>
- Nehalem<br>
- Westmere<br>
- Sandy Bridge<br>
- Ivy Bridge<br>
- Haswell<br>
- Broadwell<br>
- Skylake<br>
- Kabylake<br>
<br>
For several months, Polly has used a performance model for BLAS computations that<br>
derives optimal cache and register tile sizes from cache and latency<br>
information (based on ideas from "Analytical Modeling Is Enough for High-Performance BLIS" by Tze Meng Low, published in TOMS in 2016).<br>
While bootstrapping this model, these target values have been kept in Polly.<br>
However, as our implementation is now rather mature, it seems time to teach<br>
LLVM itself about cache sizes.<br>
<br>
Interestingly, L1 and L2 cache sizes are fairly constant across<br>
micro-architectures, so a set of architecture-specific default values<br>
seems like a good start. They can be refined into more target-specific values<br>
if newer architectures require different ones. For now, values for a set<br>
of Intel architectures are provided.<br>
<br>
Just as a little teaser, for a simple gemm kernel this model allows us to<br>
reduce the run time from 1.2s to 0.27s. For gemm kernels with less optimal<br>
memory layouts, even larger speedups can be reported.<br>
<br>
Reviewers: Meinersbur, bollu, singam-sanjay, hfinkel, gareevroman, fhahn, sebpop, efriedma, asb<br>
<br>
Reviewed By: fhahn, asb<br>
<br>
Subscribers: lsaba, asb, pollydev, llvm-commits<br>
<br>
Differential Revision: <a href="https://reviews.llvm.org/D37051" rel="noreferrer" target="_blank">https://reviews.llvm.org/<wbr>D37051</a><br>
<br>
Modified:<br>
llvm/trunk/include/llvm/<wbr>Analysis/TargetTransformInfo.h<br>
llvm/trunk/include/llvm/<wbr>Analysis/<wbr>TargetTransformInfoImpl.h<br>
llvm/trunk/lib/Analysis/<wbr>TargetTransformInfo.cpp<br>
llvm/trunk/lib/Target/X86/<wbr>X86TargetTransformInfo.cpp<br>
llvm/trunk/lib/Target/X86/<wbr>X86TargetTransformInfo.h<br>
<br>
Modified: llvm/trunk/include/llvm/<wbr>Analysis/TargetTransformInfo.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h?rev=311647&r1=311646&r2=311647&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/include/<wbr>llvm/Analysis/<wbr>TargetTransformInfo.h?rev=<wbr>311647&r1=311646&r2=311647&<wbr>view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/include/llvm/<wbr>Analysis/TargetTransformInfo.h (original)<br>
+++ llvm/trunk/include/llvm/<wbr>Analysis/TargetTransformInfo.h Thu Aug 24 02:46:25 2017<br>
@@ -603,6 +603,22 @@ public:<br>
/// \return The size of a cache line in bytes.<br>
unsigned getCacheLineSize() const;<br>
<br>
+ /// The possible cache levels<br>
+ enum class CacheLevel {<br>
+ L1D, // The L1 data cache<br>
+ L2D, // The L2 data cache<br>
+<br>
+ // We currently do not model L3 caches, as their sizes differ widely between<br>
+ // microarchitectures. Also, we currently do not have a use for L3 cache<br>
+ // size modeling yet.<br>
+ };<br>
+<br>
+ /// \return The size of the cache level in bytes, if available.<br>
+ llvm::Optional<unsigned> getCacheSize(CacheLevel Level) const;<br>
+<br>
+ /// \return The associativity of the cache level, if available.<br>
+ llvm::Optional<unsigned> getCacheAssociativity(<wbr>CacheLevel Level) const;<br>
+<br>
/// \return How much before a load we should place the prefetch instruction.<br>
/// This is currently measured in number of instructions.<br>
unsigned getPrefetchDistance() const;<br>
@@ -937,6 +953,8 @@ public:<br>
virtual bool shouldConsiderAddressTypePromo<wbr>tion(<br>
const Instruction &I, bool &<wbr>AllowPromotionWithoutCommonHea<wbr>der) = 0;<br>
virtual unsigned getCacheLineSize() = 0;<br>
+ virtual llvm::Optional<unsigned> getCacheSize(CacheLevel Level) = 0;<br>
+ virtual llvm::Optional<unsigned> getCacheAssociativity(<wbr>CacheLevel Level) = 0;<br>
virtual unsigned getPrefetchDistance() = 0;<br>
virtual unsigned getMinPrefetchStride() = 0;<br>
virtual unsigned getMaxPrefetchIterationsAhead(<wbr>) = 0;<br>
@@ -1209,6 +1227,12 @@ public:<br>
unsigned getCacheLineSize() override {<br>
return Impl.getCacheLineSize();<br>
}<br>
+ llvm::Optional<unsigned> getCacheSize(CacheLevel Level) override {<br>
+ return Impl.getCacheSize(Level);<br>
+ }<br>
+ llvm::Optional<unsigned> getCacheAssociativity(<wbr>CacheLevel Level) override {<br>
+ return Impl.getCacheAssociativity(<wbr>Level);<br>
+ }<br>
unsigned getPrefetchDistance() override { return Impl.getPrefetchDistance(); }<br>
unsigned getMinPrefetchStride() override {<br>
return Impl.getMinPrefetchStride();<br>
<br>
Modified: llvm/trunk/include/llvm/<wbr>Analysis/<wbr>TargetTransformInfoImpl.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h?rev=311647&r1=311646&r2=311647&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/include/<wbr>llvm/Analysis/<wbr>TargetTransformInfoImpl.h?rev=<wbr>311647&r1=311646&r2=311647&<wbr>view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/include/llvm/<wbr>Analysis/<wbr>TargetTransformInfoImpl.h (original)<br>
+++ llvm/trunk/include/llvm/<wbr>Analysis/<wbr>TargetTransformInfoImpl.h Thu Aug 24 02:46:25 2017<br>
@@ -340,6 +340,29 @@ public:<br>
<br>
unsigned getCacheLineSize() { return 0; }<br>
<br>
+ llvm::Optional<unsigned> getCacheSize(<wbr>TargetTransformInfo::<wbr>CacheLevel Level) {<br>
+ switch (Level) {<br>
+ case TargetTransformInfo::<wbr>CacheLevel::L1D:<br>
+ LLVM_FALLTHROUGH;<br>
+ case TargetTransformInfo::<wbr>CacheLevel::L2D:<br>
+ return llvm::Optional<unsigned>();<br>
+ }<br>
+<br>
+ llvm_unreachable("Unknown TargetTransformInfo::<wbr>CacheLevel");<br>
+ }<br>
+<br>
+ llvm::Optional<unsigned> getCacheAssociativity(<br>
+ TargetTransformInfo::<wbr>CacheLevel Level) {<br>
+ switch (Level) {<br>
+ case TargetTransformInfo::<wbr>CacheLevel::L1D:<br>
+ LLVM_FALLTHROUGH;<br>
+ case TargetTransformInfo::<wbr>CacheLevel::L2D:<br>
+ return llvm::Optional<unsigned>();<br>
+ }<br>
+<br>
+ llvm_unreachable("Unknown TargetTransformInfo::<wbr>CacheLevel");<br>
+ }<br>
+<br>
unsigned getPrefetchDistance() { return 0; }<br>
<br>
unsigned getMinPrefetchStride() { return 1; }<br>
<br>
Modified: llvm/trunk/lib/Analysis/<wbr>TargetTransformInfo.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/TargetTransformInfo.cpp?rev=311647&r1=311646&r2=311647&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/lib/<wbr>Analysis/TargetTransformInfo.<wbr>cpp?rev=311647&r1=311646&r2=<wbr>311647&view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/lib/Analysis/<wbr>TargetTransformInfo.cpp (original)<br>
+++ llvm/trunk/lib/Analysis/<wbr>TargetTransformInfo.cpp Thu Aug 24 02:46:25 2017<br>
@@ -321,6 +321,16 @@ unsigned TargetTransformInfo::<wbr>getCacheLi<br>
return TTIImpl->getCacheLineSize();<br>
}<br>
<br>
+llvm::Optional<unsigned> TargetTransformInfo::<wbr>getCacheSize(CacheLevel Level)<br>
+ const {<br>
+ return TTIImpl->getCacheSize(Level);<br>
+}<br>
+<br>
+llvm::Optional<unsigned> TargetTransformInfo::<wbr>getCacheAssociativity(<br>
+ CacheLevel Level) const {<br>
+ return TTIImpl-><wbr>getCacheAssociativity(Level);<br>
+}<br>
+<br>
unsigned TargetTransformInfo::<wbr>getPrefetchDistance() const {<br>
return TTIImpl->getPrefetchDistance()<wbr>;<br>
}<br>
<br>
Modified: llvm/trunk/lib/Target/X86/<wbr>X86TargetTransformInfo.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp?rev=311647&r1=311646&r2=311647&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/lib/Target/<wbr>X86/X86TargetTransformInfo.<wbr>cpp?rev=311647&r1=311646&r2=<wbr>311647&view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/lib/Target/X86/<wbr>X86TargetTransformInfo.cpp (original)<br>
+++ llvm/trunk/lib/Target/X86/<wbr>X86TargetTransformInfo.cpp Thu Aug 24 02:46:25 2017<br>
@@ -66,6 +66,57 @@ X86TTIImpl::getPopcntSupport(<wbr>unsigned Ty<br>
return ST->hasPOPCNT() ? TTI::PSK_FastHardware : TTI::PSK_Software;<br>
}<br>
<br>
+llvm::Optional<unsigned> X86TTIImpl::getCacheSize(<br>
+ TargetTransformInfo::<wbr>CacheLevel Level) const {<br>
+ switch (Level) {<br>
+ case TargetTransformInfo::<wbr>CacheLevel::L1D:<br>
+ // - Penry<br>
+ // - Nehalem<br>
+ // - Westmere<br>
+ // - Sandy Bridge<br>
+ // - Ivy Bridge<br>
+ // - Haswell<br>
+ // - Broadwell<br>
+ // - Skylake<br>
+ // - Kabylake<br>
+ return 32 * 1024; // 32 KByte<br>
+ case TargetTransformInfo::<wbr>CacheLevel::L2D:<br>
+ // - Penry<br>
+ // - Nehalem<br>
+ // - Westmere<br>
+ // - Sandy Bridge<br>
+ // - Ivy Bridge<br>
+ // - Haswell<br>
+ // - Broadwell<br>
+ // - Skylake<br>
+ // - Kabylake<br>
+ return 256 * 1024; // 256 KByte<br>
+ }<br>
+<br>
+ llvm_unreachable("Unknown TargetTransformInfo::<wbr>CacheLevel");<br>
+}<br>
+<br>
+llvm::Optional<unsigned> X86TTIImpl::<wbr>getCacheAssociativity(<br>
+ TargetTransformInfo::<wbr>CacheLevel Level) const {<br>
+ // - Penry<br>
+ // - Nehalem<br>
+ // - Westmere<br>
+ // - Sandy Bridge<br>
+ // - Ivy Bridge<br>
+ // - Haswell<br>
+ // - Broadwell<br>
+ // - Skylake<br>
+ // - Kabylake<br>
+ switch (Level) {<br>
+ case TargetTransformInfo::<wbr>CacheLevel::L1D:<br>
+ LLVM_FALLTHROUGH;<br>
+ case TargetTransformInfo::<wbr>CacheLevel::L2D:<br>
+ return 8;<br>
+ }<br>
+<br>
+ llvm_unreachable("Unknown TargetTransformInfo::<wbr>CacheLevel");<br>
+}<br>
+<br>
unsigned X86TTIImpl::<wbr>getNumberOfRegisters(bool Vector) {<br>
if (Vector && !ST->hasSSE1())<br>
return 0;<br>
<br>
Modified: llvm/trunk/lib/Target/X86/<wbr>X86TargetTransformInfo.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86TargetTransformInfo.h?rev=311647&r1=311646&r2=311647&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/lib/Target/<wbr>X86/X86TargetTransformInfo.h?<wbr>rev=311647&r1=311646&r2=<wbr>311647&view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/lib/Target/X86/<wbr>X86TargetTransformInfo.h (original)<br>
+++ llvm/trunk/lib/Target/X86/<wbr>X86TargetTransformInfo.h Thu Aug 24 02:46:25 2017<br>
@@ -47,6 +47,14 @@ public:<br>
<br>
/// @}<br>
<br>
+ /// \name Cache TTI Implementation<br>
+ /// @{<br>
+ llvm::Optional<unsigned> getCacheSize(<br>
+ TargetTransformInfo::<wbr>CacheLevel Level) const;<br>
+ llvm::Optional<unsigned> getCacheAssociativity(<br>
+ TargetTransformInfo::<wbr>CacheLevel Level) const;<br>
+ /// @}<br>
+<br>
/// \name Vector TTI Implementations<br>
/// @{<br>
<br>
<br>
<br>
</blockquote></div><br></div>
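<div>The X86 hook in the patch above returns 8 for both cache levels on every target. Below is a minimal sketch of how per-CPU values and the "not available" case could be expressed. It uses <code>std::optional</code> as a stand-in for <code>llvm::Optional</code>; the <code>Cpu</code> enum and the <code>numSets</code> helper are hypothetical, and the 4-way value for Skylake client is the one suggested in the reply, not verified here.</div>

```cpp
#include <cassert>
#include <optional>

// Stand-in for TargetTransformInfo::CacheLevel from the patch.
enum class CacheLevel { L1D, L2D };

// Hypothetical CPU identifiers, for illustration only.
enum class Cpu { Generic, Haswell, SkylakeClient };

// Returns the associativity if it is known for this CPU, nullopt otherwise.
std::optional<unsigned> getCacheAssociativity(Cpu C, CacheLevel Level) {
  if (C == Cpu::Generic)
    return std::nullopt; // unknown target: report "not available"
  switch (Level) {
  case CacheLevel::L1D:
    return 8u;
  case CacheLevel::L2D:
    // Assumption taken from the reply: Skylake client has a 4-way L2.
    return C == Cpu::SkylakeClient ? 4u : 8u;
  }
  return std::nullopt;
}

// Number of sets = size / (ways * line size).
// Returns nullopt if the size or associativity is unknown or zero.
std::optional<unsigned> numSets(std::optional<unsigned> Size,
                                std::optional<unsigned> Ways,
                                unsigned LineSize) {
  assert(LineSize != 0 && "line size must be non-zero");
  if (!Size || !Ways || *Ways == 0)
    return std::nullopt;
  return *Size / (*Ways * LineSize);
}
```

<div>With a 256 KiB L2 and 64-byte lines, the difference matters to a tiling model: 4 ways gives 1024 sets, while 8 ways gives 512. A client that receives no value can fall back to its own defaults.</div>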