[llvm] r311647 - Model cache size and associativity in TargetTransformInfo
Tobias Grosser via llvm-commits
llvm-commits at lists.llvm.org
Fri Aug 25 10:11:23 PDT 2017
On Thu, Aug 24, 2017, at 23:39, Craig Topper via llvm-commits wrote:
> I believe Skylake client's L2 associativity is only 4-way, while Skylake
> server has a much larger L2 with more associativity.
>
> Is this something we should get from CPUID if the user does -mcpu=native?
I was glad that, according to 7-cpu, all of these microarchitectures share
the same cache characteristics. But it seems I indeed overlooked the 4-way
L2 associativity of Skylake.
I wonder what the best way to model this would be. Are there different
mtune flags for Skylake server and client? GCC distinguishes between
'skylake' and 'skylake-avx512'. Are these the two variants you are talking
about?
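
One possible direction, purely as a rough sketch and not what r311647
implements: key the cache parameters on the CPU variant instead of returning
a single x86-wide value. The names SkylakeVariant, CacheInfo, getCacheInfo
and getNumCacheSets below are hypothetical and do not exist in LLVM. The
client L2 associativity (4-way) is the value you mention; the server L2
numbers (1 MiB, 16-way) are my assumption from public specifications and
would need to be double-checked.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-ins for illustration; these are not LLVM types.
enum class SkylakeVariant { Client, Server };
enum class CacheLevel { L1D, L2D };

struct CacheInfo {
  unsigned SizeInBytes;
  unsigned Associativity;
};

// Returns per-variant cache parameters. The server L2 values are an
// assumption taken from public specifications, not from the commit.
inline std::optional<CacheInfo> getCacheInfo(SkylakeVariant Variant,
                                             CacheLevel Level) {
  switch (Level) {
  case CacheLevel::L1D:
    // Same for both variants: 32 KByte, 8-way.
    return CacheInfo{32 * 1024, 8};
  case CacheLevel::L2D:
    if (Variant == SkylakeVariant::Client)
      return CacheInfo{256 * 1024, 4};   // 256 KByte, 4-way
    return CacheInfo{1024 * 1024, 16};   // 1 MByte, 16-way (assumed)
  }
  return std::nullopt;
}

// Number of sets = size / (associativity * line size). A consumer such as
// a tiling model could derive this from the two queried values.
inline unsigned getNumCacheSets(const CacheInfo &Info,
                                unsigned LineSizeInBytes) {
  assert(LineSizeInBytes > 0 && Info.Associativity > 0);
  return Info.SizeInBytes / (Info.Associativity * LineSizeInBytes);
}
```

A caller would pick the variant from the selected CPU and fall back to
"unknown" (an empty optional) for anything it has no data for, which matches
how getCacheSize() and getCacheAssociativity() already report missing values.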
Best,
Tobias
> ~Craig
>
> On Thu, Aug 24, 2017 at 2:46 AM, Tobias Grosser via llvm-commits <
> llvm-commits at lists.llvm.org> wrote:
>
> > Author: grosser
> > Date: Thu Aug 24 02:46:25 2017
> > New Revision: 311647
> >
> > URL: http://llvm.org/viewvc/llvm-project?rev=311647&view=rev
> > Log:
> > Model cache size and associativity in TargetTransformInfo
> >
> > Summary:
> > We add the precise cache sizes and associativity for the following Intel
> > architectures:
> >
> > - Penryn
> > - Nehalem
> > - Westmere
> > - Sandy Bridge
> > - Ivy Bridge
> > - Haswell
> > - Broadwell
> > - Skylake
> > - Kabylake
> >
> > For several months, Polly has used a performance model for BLAS
> > computations that derives optimal cache and register tile sizes from
> > cache and latency information (based on ideas from "Analytical Modeling
> > Is Enough for High-Performance BLIS", by Tze Meng Low, published in TOMS
> > 2016).
> > While bootstrapping this model, these target values have been kept in
> > Polly.
> > However, as our implementation is now rather mature, it seems time to teach
> > LLVM itself about cache sizes.
> >
> > Interestingly, L1 and L2 cache sizes are pretty constant across
> > micro-architectures, hence a set of architecture-specific default values
> > seems like a good start. They can be expanded to more target-specific
> > values in case certain newer architectures require different values. For
> > now, values for a set of Intel architectures are provided.
> >
> > Just as a little teaser, for a simple gemm kernel this model allows us to
> > improve performance from 1.2s to 0.27s. For gemm kernels with less optimal
> > memory layouts even larger speedups can be reported.
> >
> > Reviewers: Meinersbur, bollu, singam-sanjay, hfinkel, gareevroman, fhahn,
> > sebpop, efriedma, asb
> >
> > Reviewed By: fhahn, asb
> >
> > Subscribers: lsaba, asb, pollydev, llvm-commits
> >
> > Differential Revision: https://reviews.llvm.org/D37051
> >
> > Modified:
> > llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h
> > llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h
> > llvm/trunk/lib/Analysis/TargetTransformInfo.cpp
> > llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp
> > llvm/trunk/lib/Target/X86/X86TargetTransformInfo.h
> >
> > Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h
> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/
> > TargetTransformInfo.h?rev=311647&r1=311646&r2=311647&view=diff
> > ============================================================
> > ==================
> > --- llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h (original)
> > +++ llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h Thu Aug 24
> > 02:46:25 2017
> > @@ -603,6 +603,22 @@ public:
> > /// \return The size of a cache line in bytes.
> > unsigned getCacheLineSize() const;
> >
> > + /// The possible cache levels
> > + enum class CacheLevel {
> > + L1D, // The L1 data cache
> > + L2D, // The L2 data cache
> > +
> > + // We currently do not model L3 caches, as their sizes differ widely
> > between
> > + // microarchitectures. Also, we currently do not have a use for L3
> > cache
> > + // size modeling yet.
> > + };
> > +
> > + /// \return The size of the cache level in bytes, if available.
> > + llvm::Optional<unsigned> getCacheSize(CacheLevel Level) const;
> > +
> > + /// \return The associativity of the cache level, if available.
> > + llvm::Optional<unsigned> getCacheAssociativity(CacheLevel Level) const;
> > +
> > /// \return How much before a load we should place the prefetch
> > instruction.
> > /// This is currently measured in number of instructions.
> > unsigned getPrefetchDistance() const;
> > @@ -937,6 +953,8 @@ public:
> > virtual bool shouldConsiderAddressTypePromotion(
> > const Instruction &I, bool &AllowPromotionWithoutCommonHeader) = 0;
> > virtual unsigned getCacheLineSize() = 0;
> > + virtual llvm::Optional<unsigned> getCacheSize(CacheLevel Level) = 0;
> > + virtual llvm::Optional<unsigned> getCacheAssociativity(CacheLevel
> > Level) = 0;
> > virtual unsigned getPrefetchDistance() = 0;
> > virtual unsigned getMinPrefetchStride() = 0;
> > virtual unsigned getMaxPrefetchIterationsAhead() = 0;
> > @@ -1209,6 +1227,12 @@ public:
> > unsigned getCacheLineSize() override {
> > return Impl.getCacheLineSize();
> > }
> > + llvm::Optional<unsigned> getCacheSize(CacheLevel Level) override {
> > + return Impl.getCacheSize(Level);
> > + }
> > + llvm::Optional<unsigned> getCacheAssociativity(CacheLevel Level)
> > override {
> > + return Impl.getCacheAssociativity(Level);
> > + }
> > unsigned getPrefetchDistance() override { return
> > Impl.getPrefetchDistance(); }
> > unsigned getMinPrefetchStride() override {
> > return Impl.getMinPrefetchStride();
> >
> > Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h
> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/
> > TargetTransformInfoImpl.h?rev=311647&r1=311646&r2=311647&view=diff
> > ============================================================
> > ==================
> > --- llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h (original)
> > +++ llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h Thu Aug 24
> > 02:46:25 2017
> > @@ -340,6 +340,29 @@ public:
> >
> > unsigned getCacheLineSize() { return 0; }
> >
> > + llvm::Optional<unsigned> getCacheSize(TargetTransformInfo::CacheLevel
> > Level) {
> > + switch (Level) {
> > + case TargetTransformInfo::CacheLevel::L1D:
> > + LLVM_FALLTHROUGH;
> > + case TargetTransformInfo::CacheLevel::L2D:
> > + return llvm::Optional<unsigned>();
> > + }
> > +
> > + llvm_unreachable("Unknown TargetTransformInfo::CacheLevel");
> > + }
> > +
> > + llvm::Optional<unsigned> getCacheAssociativity(
> > + TargetTransformInfo::CacheLevel Level) {
> > + switch (Level) {
> > + case TargetTransformInfo::CacheLevel::L1D:
> > + LLVM_FALLTHROUGH;
> > + case TargetTransformInfo::CacheLevel::L2D:
> > + return llvm::Optional<unsigned>();
> > + }
> > +
> > + llvm_unreachable("Unknown TargetTransformInfo::CacheLevel");
> > + }
> > +
> > unsigned getPrefetchDistance() { return 0; }
> >
> > unsigned getMinPrefetchStride() { return 1; }
> >
> > Modified: llvm/trunk/lib/Analysis/TargetTransformInfo.cpp
> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/
> > Analysis/TargetTransformInfo.cpp?rev=311647&r1=311646&r2=311647&view=diff
> > ============================================================
> > ==================
> > --- llvm/trunk/lib/Analysis/TargetTransformInfo.cpp (original)
> > +++ llvm/trunk/lib/Analysis/TargetTransformInfo.cpp Thu Aug 24 02:46:25
> > 2017
> > @@ -321,6 +321,16 @@ unsigned TargetTransformInfo::getCacheLi
> > return TTIImpl->getCacheLineSize();
> > }
> >
> > +llvm::Optional<unsigned> TargetTransformInfo::getCacheSize(CacheLevel
> > Level)
> > + const {
> > + return TTIImpl->getCacheSize(Level);
> > +}
> > +
> > +llvm::Optional<unsigned> TargetTransformInfo::getCacheAssociativity(
> > + CacheLevel Level) const {
> > + return TTIImpl->getCacheAssociativity(Level);
> > +}
> > +
> > unsigned TargetTransformInfo::getPrefetchDistance() const {
> > return TTIImpl->getPrefetchDistance();
> > }
> >
> > Modified: llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp
> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/
> > X86/X86TargetTransformInfo.cpp?rev=311647&r1=311646&r2=311647&view=diff
> > ============================================================
> > ==================
> > --- llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp (original)
> > +++ llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp Thu Aug 24
> > 02:46:25 2017
> > @@ -66,6 +66,57 @@ X86TTIImpl::getPopcntSupport(unsigned Ty
> > return ST->hasPOPCNT() ? TTI::PSK_FastHardware : TTI::PSK_Software;
> > }
> >
> > +llvm::Optional<unsigned> X86TTIImpl::getCacheSize(
> > + TargetTransformInfo::CacheLevel Level) const {
> > + switch (Level) {
> > + case TargetTransformInfo::CacheLevel::L1D:
> > + // - Penry
> > + // - Nehalem
> > + // - Westmere
> > + // - Sandy Bridge
> > + // - Ivy Bridge
> > + // - Haswell
> > + // - Broadwell
> > + // - Skylake
> > + // - Kabylake
> > + return 32 * 1024; // 32 KByte
> > + case TargetTransformInfo::CacheLevel::L2D:
> > + // - Penry
> > + // - Nehalem
> > + // - Westmere
> > + // - Sandy Bridge
> > + // - Ivy Bridge
> > + // - Haswell
> > + // - Broadwell
> > + // - Skylake
> > + // - Kabylake
> > + return 256 * 1024; // 256 KByte
> > + }
> > +
> > + llvm_unreachable("Unknown TargetTransformInfo::CacheLevel");
> > +}
> > +
> > +llvm::Optional<unsigned> X86TTIImpl::getCacheAssociativity(
> > + TargetTransformInfo::CacheLevel Level) const {
> > + // - Penry
> > + // - Nehalem
> > + // - Westmere
> > + // - Sandy Bridge
> > + // - Ivy Bridge
> > + // - Haswell
> > + // - Broadwell
> > + // - Skylake
> > + // - Kabylake
> > + switch (Level) {
> > + case TargetTransformInfo::CacheLevel::L1D:
> > + LLVM_FALLTHROUGH;
> > + case TargetTransformInfo::CacheLevel::L2D:
> > + return 8;
> > + }
> > +
> > + llvm_unreachable("Unknown TargetTransformInfo::CacheLevel");
> > +}
> > +
> > unsigned X86TTIImpl::getNumberOfRegisters(bool Vector) {
> > if (Vector && !ST->hasSSE1())
> > return 0;
> >
> > Modified: llvm/trunk/lib/Target/X86/X86TargetTransformInfo.h
> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/
> > X86/X86TargetTransformInfo.h?rev=311647&r1=311646&r2=311647&view=diff
> > ============================================================
> > ==================
> > --- llvm/trunk/lib/Target/X86/X86TargetTransformInfo.h (original)
> > +++ llvm/trunk/lib/Target/X86/X86TargetTransformInfo.h Thu Aug 24
> > 02:46:25 2017
> > @@ -47,6 +47,14 @@ public:
> >
> > /// @}
> >
> > + /// \name Cache TTI Implementation
> > + /// @{
> > + llvm::Optional<unsigned> getCacheSize(
> > + TargetTransformInfo::CacheLevel Level) const;
> > + llvm::Optional<unsigned> getCacheAssociativity(
> > + TargetTransformInfo::CacheLevel Level) const;
> > + /// @}
> > +
> > /// \name Vector TTI Implementations
> > /// @{
> >
> >
> >
> > _______________________________________________
> > llvm-commits mailing list
> > llvm-commits at lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
> >