[llvm] r311647 - Model cache size and associativity in TargetTransformInfo
Tobias Grosser via llvm-commits
llvm-commits at lists.llvm.org
Sat Aug 26 00:11:12 PDT 2017
On Fri, Aug 25, 2017, at 19:58, Craig Topper via llvm-commits wrote:
> Once https://reviews.llvm.org/D35348 goes in, maybe we can detect it with a
> feature bit.
Great. I put myself on the observer list and will get back to this as
soon as it gets in. Thanks for the pointer.
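
In the interim, a CPU-name check along the lines of the getCPU() suggestion
quoted below could look roughly like this. It is only a sketch under stated
assumptions: `cacheAssociativityForCPU` is a hypothetical helper that does not
exist in LLVM, `std::optional` stands in for `llvm::Optional`, and the enum is a
local copy of the one added in r311647. The server part is left unset because
nothing in this thread establishes its value.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Local stand-in for TargetTransformInfo::CacheLevel from r311647.
enum class CacheLevel { L1D, L2D };

// Hypothetical helper (not an LLVM API): choose the associativity from the
// CPU name string, i.e. the kind of value Subtarget's getCPU() would return.
std::optional<unsigned> cacheAssociativityForCPU(std::string_view CPU,
                                                 CacheLevel Level) {
  switch (Level) {
  case CacheLevel::L1D:
    return 8; // default used for all modeled CPUs in r311647
  case CacheLevel::L2D:
    if (CPU == "skylake")
      return 4; // Skylake client L2, as pointed out in the review
    if (CPU == "skylake-avx512")
      return std::nullopt; // server part: value not established here
    return 8; // default used for the other modeled CPUs in r311647
  }
  assert(false && "Unknown CacheLevel");
  return std::nullopt;
}
```

Returning an empty optional for the server part keeps callers on their
fallback path instead of feeding them a guessed value.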
Best,
Tobias
>
> ~Craig
>
> On Fri, Aug 25, 2017 at 10:54 AM, Craig Topper <craig.topper at gmail.com>
> wrote:
>
> > I believe there is a getCPU() method in the Subtarget. It's inherited from
> > MCSubtargetInfo.
> >
> > ~Craig
> >
> > On Fri, Aug 25, 2017 at 10:29 AM, Tobias Grosser via llvm-commits <
> > llvm-commits at lists.llvm.org> wrote:
> >
> >> On Fri, Aug 25, 2017, at 19:20, Craig Topper via llvm-commits wrote:
> >> > clang doesn't support -mtune. But we do have -march. llc calls it -mcpu.
> >> > "skylake" refers to the mobile and desktop version with 4-way
> >> > associativity. "skylake-avx512" refers to the Xeon server version.
> >>
> >> The X86TargetTransformInfo seems to only refer to instruction set
> >> properties. I can check for AVX512 && !Atom. Maybe this will give me the
> >> skylake-avx512. However, this seems rather cryptic. I should probably
> >> look for a better alternative. Any ideas?
> >>
> >> // Attempt to lookup cost.
> >> if (ST->hasCDI())
> >> if (const auto *Entry = CostTableLookup(AVX512CDCostTbl, ISD, MTy))
> >> return LT.first * Entry->Cost;
> >>
> >> if (ST->hasBWI())
> >> if (const auto *Entry = CostTableLookup(AVX512BWCostTbl, ISD, MTy))
> >> return LT.first * Entry->Cost;
> >>
> >> if (ST->hasAVX512())
> >>
> >> Best,
> >> Tobias
> >>
> >> >
> >> > ~Craig
> >> >
> >> > On Fri, Aug 25, 2017 at 10:11 AM, Tobias Grosser <tobias at grosser.es>
> >> > wrote:
> >> >
> >> > > On Thu, Aug 24, 2017, at 23:39, Craig Topper via llvm-commits wrote:
> >> > > > I believe Skylake client's L2 associativity is only 4-way, while Skylake
> >> > > > server has a much larger L2 with more associativity.
> >> > > >
> >> > > > Is this something we should get from CPUID if the user does
> >> > > > -mcpu=native?
> >> > >
> >> > > I was so glad that -- according to 7-cpu -- all have the same
> >> > > characteristics. But it seems I indeed overlooked the 4-way associativity
> >> > > of Skylake.
> >> > >
> >> > > I wonder what would be the best way to model this. Are there different
> >> > > mtune flags for Skylake server and client? GCC distinguishes between
> >> > > 'skylake' and 'skylake-avx512'; are these the two variants you are talking
> >> > > about?
> >> > >
> >> > > Best,
> >> > > Tobias
> >> > >
> >> > > > ~Craig
> >> > > >
> >> > > > On Thu, Aug 24, 2017 at 2:46 AM, Tobias Grosser via llvm-commits <
> >> > > > llvm-commits at lists.llvm.org> wrote:
> >> > > >
> >> > > > > Author: grosser
> >> > > > > Date: Thu Aug 24 02:46:25 2017
> >> > > > > New Revision: 311647
> >> > > > >
> >> > > > > URL: http://llvm.org/viewvc/llvm-project?rev=311647&view=rev
> >> > > > > Log:
> >> > > > > Model cache size and associativity in TargetTransformInfo
> >> > > > >
> >> > > > > Summary:
> >> > > > > We add the precise cache sizes and associativity for the following Intel
> >> > > > > architectures:
> >> > > > >
> >> > > > > - Penryn
> >> > > > > - Nehalem
> >> > > > > - Westmere
> >> > > > > - Sandy Bridge
> >> > > > > - Ivy Bridge
> >> > > > > - Haswell
> >> > > > > - Broadwell
> >> > > > > - Skylake
> >> > > > > - Kabylake
> >> > > > >
> >> > > > > For several months, Polly has used a performance model for BLAS computations
> >> > > > > that derives optimal cache and register tile sizes from cache and latency
> >> > > > > information (based on ideas from "Analytical Modeling Is Enough for
> >> > > > > High-Performance BLIS", by Tze Meng Low, published in TOMS 2016).
> >> > > > > While bootstrapping this model, these target values have been kept in Polly.
> >> > > > > However, as our implementation is now rather mature, it seems time to teach
> >> > > > > LLVM itself about cache sizes.
> >> > > > >
> >> > > > > Interestingly, L1 and L2 cache sizes are pretty constant across
> >> > > > > micro-architectures, hence a set of architecture-specific default values
> >> > > > > seems like a good start. They can be expanded to more target-specific values
> >> > > > > in case certain newer architectures require different values. For now, a set
> >> > > > > of Intel architectures is provided.
> >> > > > >
> >> > > > > Just as a little teaser, for a simple gemm kernel this model allows us to
> >> > > > > improve performance from 1.2s to 0.27s. For gemm kernels with less optimal
> >> > > > > memory layouts even larger speedups can be reported.
> >> > > > >
> >> > > > > Reviewers: Meinersbur, bollu, singam-sanjay, hfinkel, gareevroman, fhahn,
> >> > > > > sebpop, efriedma, asb
> >> > > > >
> >> > > > > Reviewed By: fhahn, asb
> >> > > > >
> >> > > > > Subscribers: lsaba, asb, pollydev, llvm-commits
> >> > > > >
> >> > > > > Differential Revision: https://reviews.llvm.org/D37051
> >> > > > >
> >> > > > > Modified:
> >> > > > > llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h
> >> > > > > llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h
> >> > > > > llvm/trunk/lib/Analysis/TargetTransformInfo.cpp
> >> > > > > llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp
> >> > > > > llvm/trunk/lib/Target/X86/X86TargetTransformInfo.h
> >> > > > >
> >> > > > > Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h
> >> > > > > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h?rev=311647&r1=311646&r2=311647&view=diff
> >> > > > > ==============================================================================
> >> > > > > --- llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h (original)
> >> > > > > +++ llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h Thu Aug 24 02:46:25 2017
> >> > > > > @@ -603,6 +603,22 @@ public:
> >> > > > > /// \return The size of a cache line in bytes.
> >> > > > > unsigned getCacheLineSize() const;
> >> > > > >
> >> > > > > + /// The possible cache levels
> >> > > > > + enum class CacheLevel {
> >> > > > > + L1D, // The L1 data cache
> >> > > > > + L2D, // The L2 data cache
> >> > > > > +
> >> > > > > + // We currently do not model L3 caches, as their sizes differ widely between
> >> > > > > + // microarchitectures. Also, we currently do not have a use for L3 cache
> >> > > > > + // size modeling yet.
> >> > > > > + };
> >> > > > > +
> >> > > > > + /// \return The size of the cache level in bytes, if available.
> >> > > > > + llvm::Optional<unsigned> getCacheSize(CacheLevel Level) const;
> >> > > > > +
> >> > > > > + /// \return The associativity of the cache level, if available.
> >> > > > > + llvm::Optional<unsigned> getCacheAssociativity(CacheLevel Level) const;
> >> > > > > +
> >> > > > > /// \return How much before a load we should place the prefetch instruction.
> >> > > > > /// This is currently measured in number of instructions.
> >> > > > > unsigned getPrefetchDistance() const;
> >> > > > > @@ -937,6 +953,8 @@ public:
> >> > > > > virtual bool shouldConsiderAddressTypePromotion(
> >> > > > > const Instruction &I, bool &AllowPromotionWithoutCommonHeader) = 0;
> >> > > > > virtual unsigned getCacheLineSize() = 0;
> >> > > > > + virtual llvm::Optional<unsigned> getCacheSize(CacheLevel Level) = 0;
> >> > > > > + virtual llvm::Optional<unsigned> getCacheAssociativity(CacheLevel Level) = 0;
> >> > > > > virtual unsigned getPrefetchDistance() = 0;
> >> > > > > virtual unsigned getMinPrefetchStride() = 0;
> >> > > > > virtual unsigned getMaxPrefetchIterationsAhead() = 0;
> >> > > > > @@ -1209,6 +1227,12 @@ public:
> >> > > > > unsigned getCacheLineSize() override {
> >> > > > > return Impl.getCacheLineSize();
> >> > > > > }
> >> > > > > + llvm::Optional<unsigned> getCacheSize(CacheLevel Level) override {
> >> > > > > + return Impl.getCacheSize(Level);
> >> > > > > + }
> >> > > > > + llvm::Optional<unsigned> getCacheAssociativity(CacheLevel Level) override {
> >> > > > > + return Impl.getCacheAssociativity(Level);
> >> > > > > + }
> >> > > > > unsigned getPrefetchDistance() override { return Impl.getPrefetchDistance(); }
> >> > > > > unsigned getMinPrefetchStride() override {
> >> > > > > return Impl.getMinPrefetchStride();
> >> > > > >
> >> > > > > Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h
> >> > > > > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h?rev=311647&r1=311646&r2=311647&view=diff
> >> > > > > ==============================================================================
> >> > > > > --- llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h (original)
> >> > > > > +++ llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h Thu Aug 24 02:46:25 2017
> >> > > > > @@ -340,6 +340,29 @@ public:
> >> > > > >
> >> > > > > unsigned getCacheLineSize() { return 0; }
> >> > > > >
> >> > > > > + llvm::Optional<unsigned> getCacheSize(TargetTransformInfo::CacheLevel Level) {
> >> > > > > + switch (Level) {
> >> > > > > + case TargetTransformInfo::CacheLevel::L1D:
> >> > > > > + LLVM_FALLTHROUGH;
> >> > > > > + case TargetTransformInfo::CacheLevel::L2D:
> >> > > > > + return llvm::Optional<unsigned>();
> >> > > > > + }
> >> > > > > +
> >> > > > > + llvm_unreachable("Unknown TargetTransformInfo::CacheLevel");
> >> > > > > + }
> >> > > > > +
> >> > > > > + llvm::Optional<unsigned> getCacheAssociativity(
> >> > > > > + TargetTransformInfo::CacheLevel Level) {
> >> > > > > + switch (Level) {
> >> > > > > + case TargetTransformInfo::CacheLevel::L1D:
> >> > > > > + LLVM_FALLTHROUGH;
> >> > > > > + case TargetTransformInfo::CacheLevel::L2D:
> >> > > > > + return llvm::Optional<unsigned>();
> >> > > > > + }
> >> > > > > +
> >> > > > > + llvm_unreachable("Unknown TargetTransformInfo::CacheLevel");
> >> > > > > + }
> >> > > > > +
> >> > > > > unsigned getPrefetchDistance() { return 0; }
> >> > > > >
> >> > > > > unsigned getMinPrefetchStride() { return 1; }
> >> > > > >
> >> > > > > Modified: llvm/trunk/lib/Analysis/TargetTransformInfo.cpp
> >> > > > > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/TargetTransformInfo.cpp?rev=311647&r1=311646&r2=311647&view=diff
> >> > > > > ==============================================================================
> >> > > > > --- llvm/trunk/lib/Analysis/TargetTransformInfo.cpp (original)
> >> > > > > +++ llvm/trunk/lib/Analysis/TargetTransformInfo.cpp Thu Aug 24 02:46:25 2017
> >> > > > > @@ -321,6 +321,16 @@ unsigned TargetTransformInfo::getCacheLi
> >> > > > > return TTIImpl->getCacheLineSize();
> >> > > > > }
> >> > > > >
> >> > > > > +llvm::Optional<unsigned> TargetTransformInfo::getCacheSize(CacheLevel Level)
> >> > > > > + const {
> >> > > > > + return TTIImpl->getCacheSize(Level);
> >> > > > > +}
> >> > > > > +
> >> > > > > +llvm::Optional<unsigned> TargetTransformInfo::getCacheAssociativity(
> >> > > > > + CacheLevel Level) const {
> >> > > > > + return TTIImpl->getCacheAssociativity(Level);
> >> > > > > +}
> >> > > > > +
> >> > > > > unsigned TargetTransformInfo::getPrefetchDistance() const {
> >> > > > > return TTIImpl->getPrefetchDistance();
> >> > > > > }
> >> > > > >
> >> > > > > Modified: llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp
> >> > > > > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp?rev=311647&r1=311646&r2=311647&view=diff
> >> > > > > ==============================================================================
> >> > > > > --- llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp (original)
> >> > > > > +++ llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp Thu Aug 24 02:46:25 2017
> >> > > > > @@ -66,6 +66,57 @@ X86TTIImpl::getPopcntSupport(unsigned Ty
> >> > > > > return ST->hasPOPCNT() ? TTI::PSK_FastHardware : TTI::PSK_Software;
> >> > > > > }
> >> > > > >
> >> > > > > +llvm::Optional<unsigned> X86TTIImpl::getCacheSize(
> >> > > > > + TargetTransformInfo::CacheLevel Level) const {
> >> > > > > + switch (Level) {
> >> > > > > + case TargetTransformInfo::CacheLevel::L1D:
> >> > > > > + // - Penryn
> >> > > > > + // - Nehalem
> >> > > > > + // - Westmere
> >> > > > > + // - Sandy Bridge
> >> > > > > + // - Ivy Bridge
> >> > > > > + // - Haswell
> >> > > > > + // - Broadwell
> >> > > > > + // - Skylake
> >> > > > > + // - Kabylake
> >> > > > > + return 32 * 1024; // 32 KByte
> >> > > > > + case TargetTransformInfo::CacheLevel::L2D:
> >> > > > > + // - Penryn
> >> > > > > + // - Nehalem
> >> > > > > + // - Westmere
> >> > > > > + // - Sandy Bridge
> >> > > > > + // - Ivy Bridge
> >> > > > > + // - Haswell
> >> > > > > + // - Broadwell
> >> > > > > + // - Skylake
> >> > > > > + // - Kabylake
> >> > > > > + return 256 * 1024; // 256 KByte
> >> > > > > + }
> >> > > > > +
> >> > > > > + llvm_unreachable("Unknown TargetTransformInfo::CacheLevel");
> >> > > > > +}
> >> > > > > +
> >> > > > > +llvm::Optional<unsigned> X86TTIImpl::getCacheAssociativity(
> >> > > > > + TargetTransformInfo::CacheLevel Level) const {
> >> > > > > + // - Penryn
> >> > > > > + // - Nehalem
> >> > > > > + // - Westmere
> >> > > > > + // - Sandy Bridge
> >> > > > > + // - Ivy Bridge
> >> > > > > + // - Haswell
> >> > > > > + // - Broadwell
> >> > > > > + // - Skylake
> >> > > > > + // - Kabylake
> >> > > > > + switch (Level) {
> >> > > > > + case TargetTransformInfo::CacheLevel::L1D:
> >> > > > > + LLVM_FALLTHROUGH;
> >> > > > > + case TargetTransformInfo::CacheLevel::L2D:
> >> > > > > + return 8;
> >> > > > > + }
> >> > > > > +
> >> > > > > + llvm_unreachable("Unknown TargetTransformInfo::CacheLevel");
> >> > > > > +}
> >> > > > > +
> >> > > > > unsigned X86TTIImpl::getNumberOfRegisters(bool Vector) {
> >> > > > > if (Vector && !ST->hasSSE1())
> >> > > > > return 0;
> >> > > > >
> >> > > > > Modified: llvm/trunk/lib/Target/X86/X86TargetTransformInfo.h
> >> > > > > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86TargetTransformInfo.h?rev=311647&r1=311646&r2=311647&view=diff
> >> > > > > ==============================================================================
> >> > > > > --- llvm/trunk/lib/Target/X86/X86TargetTransformInfo.h (original)
> >> > > > > +++ llvm/trunk/lib/Target/X86/X86TargetTransformInfo.h Thu Aug 24 02:46:25 2017
> >> > > > > @@ -47,6 +47,14 @@ public:
> >> > > > >
> >> > > > > /// @}
> >> > > > >
> >> > > > > + /// \name Cache TTI Implementation
> >> > > > > + /// @{
> >> > > > > + llvm::Optional<unsigned> getCacheSize(
> >> > > > > + TargetTransformInfo::CacheLevel Level) const;
> >> > > > > + llvm::Optional<unsigned> getCacheAssociativity(
> >> > > > > + TargetTransformInfo::CacheLevel Level) const;
> >> > > > > + /// @}
> >> > > > > +
> >> > > > > /// \name Vector TTI Implementations
> >> > > > > /// @{
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > _______________________________________________
> >> > > > > llvm-commits mailing list
> >> > > > > llvm-commits at lists.llvm.org
> >> > > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
> >> > > > >
> >> > >
> >>
> >
> >
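
Since the new getCacheSize() and getCacheAssociativity() hooks quoted above
return an empty Optional on targets that do not override them, any client has to
handle the "unknown" case. The following is a minimal sketch of that pattern,
not Polly's actual model: `X86LikeCacheInfo`, `UnknownCacheInfo` and
`elementsPerWay` are hypothetical names, `std::optional` stands in for
`llvm::Optional`, and the numbers are the X86 defaults from r311647.

```cpp
#include <cassert>
#include <optional>

// Local stand-in for TargetTransformInfo::CacheLevel from r311647.
enum class CacheLevel { L1D, L2D };

// Hypothetical provider mirroring the X86 defaults added in r311647.
struct X86LikeCacheInfo {
  std::optional<unsigned> getCacheSize(CacheLevel Level) const {
    switch (Level) {
    case CacheLevel::L1D:
      return 32u * 1024u; // 32 KByte
    case CacheLevel::L2D:
      return 256u * 1024u; // 256 KByte
    }
    return std::nullopt;
  }
  std::optional<unsigned> getCacheAssociativity(CacheLevel) const { return 8; }
};

// Hypothetical provider mirroring the base implementation: nothing is known.
struct UnknownCacheInfo {
  std::optional<unsigned> getCacheSize(CacheLevel) const { return std::nullopt; }
  std::optional<unsigned> getCacheAssociativity(CacheLevel) const {
    return std::nullopt;
  }
};

// Hypothetical consumer: number of elements that fit into one way of the
// given cache level, or Fallback when the target provides no information.
template <typename CacheInfoT>
unsigned elementsPerWay(const CacheInfoT &Info, CacheLevel Level,
                        unsigned ElemBytes, unsigned Fallback) {
  assert(ElemBytes > 0 && "element size must be non-zero");
  std::optional<unsigned> Size = Info.getCacheSize(Level);
  std::optional<unsigned> Assoc = Info.getCacheAssociativity(Level);
  if (!Size || !Assoc || *Assoc == 0)
    return Fallback;
  return *Size / *Assoc / ElemBytes;
}
```

With 8-byte doubles, the X86 defaults give 32 KByte / 8 ways / 8 bytes per
element for L1D, while a target without cache information simply gets the
caller's fallback.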