[PATCH] D58736: [System Model] Introduce a target system model

David Greene via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Apr 3 14:51:05 PDT 2019


greened marked 10 inline comments as done.
greened added inline comments.


================
Comment at: llvm/include/llvm/Analysis/TargetTransformInfo.h:735-742
+  /// \return Whether to prefetch loads.
+  bool prefetchReads() const;
+
+  /// \return Whether to prefetch stores.
+  bool prefetchWrites() const;
+
+  /// \return Whether to use read prefetches for stores.
----------------
Meinersbur wrote:
> [bikeshedding] The name should indicate that these methods just return some information. How about `isPrefetchingReads`/etc.?
Good point.  I think `shouldPrefetchReads` would better convey what this is supposed to be telling us.
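For example, the query side would then read something like this (just a sketch of the renamed declarations, not the final wording):
```
  /// \return Whether the target wants loads to be prefetched.
  bool shouldPrefetchReads() const;

  /// \return Whether the target wants stores to be prefetched.
  bool shouldPrefetchWrites() const;
```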


================
Comment at: llvm/include/llvm/Analysis/TargetTransformInfo.h:1136
       const Instruction &I, bool &AllowPromotionWithoutCommonHeader) = 0;
-  virtual unsigned getCacheLineSize() = 0;
-  virtual llvm::Optional<unsigned> getCacheSize(CacheLevel Level) = 0;
-  virtual llvm::Optional<unsigned> getCacheAssociativity(CacheLevel Level) = 0;
-  virtual unsigned getPrefetchDistance() = 0;
-  virtual unsigned getMinPrefetchStride() = 0;
-  virtual unsigned getMaxPrefetchIterationsAhead() = 0;
+  virtual unsigned getCacheLineSize() const = 0;
+  virtual llvm::Optional<unsigned> getCacheSize(CacheLevel Level) const = 0;
----------------
Meinersbur wrote:
> It is interesting that getCacheSize/getCacheAssociativity return `llvm::Optional`s and are cache-level specific, but not getCacheLineSize
Yeah.  I assume that's because on almost every processor the cache line size is the same throughout the cache hierarchy, so the interface was designed assuming that no cache level parameter was necessary.  If there's no cache level parameter, there's no possibility of the user asking for a non-existent cache level, and thus no need for an `llvm::Optional`.

However, there have been processors in the past where cache line size varied by level, and certainly when you have a heterogeneous system the cache line size will not be uniform across the system.
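If we wanted to expose that, a level-aware variant could mirror the existing `Optional`-returning queries.  This is only an illustration of the shape such an interface might take; it is not part of this patch:
```
  /// Hypothetical level-aware query, mirroring getCacheSize and
  /// getCacheAssociativity.  Returning llvm::Optional lets the target
  /// signal that the requested cache level does not exist.
  virtual llvm::Optional<unsigned>
  getCacheLineSize(CacheLevel Level) const = 0;
```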


================
Comment at: llvm/include/llvm/Analysis/TargetTransformInfoImpl.h:407
+
+  unsigned getMaxPrefetchIterationsAhead() const { return UINT_MAX; }
 
----------------
Meinersbur wrote:
> simoll wrote:
> > What is this method supposed to return?
> > The right value seems to be a property of a specific pair of a loop and a target architecture rather than just the target alone.
> It is used by LoopDataPrefetch. `TargetTransformInfo.h` contains a description:
> ```
>   /// \return The maximum number of iterations to prefetch ahead.  If the
>   /// required number of iterations is more than this number, no prefetching is
>   /// performed.
>   unsigned getMaxPrefetchIterationsAhead() const;
> ```
> I assume it was added just because the current code base already defines it.
Yes, that's exactly right.  When I originally posted the RFC, I suggested a single prefetch distance in bytes.  Others chimed in and said they preferred to think in terms of loop iterations, and that is indeed how `LoopDataPrefetch` works.  `LoopDataPrefetch` also drove the inclusion of `getMinPrefetchStride`.

Since I didn't want to change how current passes work, I added the necessary infrastructure to have the system model specify it.  I expect that as we gain experience we may eliminate some of these interfaces.
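Roughly, this is how `LoopDataPrefetch` consumes those hooks today (a simplified sketch of the existing logic, not a verbatim quote of the pass):
```
  // Derive the number of iterations to prefetch ahead from the target's
  // prefetch distance and the loop's size; if the result exceeds the
  // target's limit, skip prefetching for this loop entirely.
  unsigned ItersAhead = TTI->getPrefetchDistance() / LoopSize;
  if (ItersAhead == 0)
    ItersAhead = 1;
  if (ItersAhead > TTI->getMaxPrefetchIterationsAhead())
    return MadeChange;
```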


================
Comment at: llvm/include/llvm/MC/MCSubtargetInfo.h:58-65
+  /// If caches at a particular level are of different sizes, ask the
+  /// target what to do.  By default, return zero.
+  ///
+  virtual Optional<unsigned>
+  resolveCacheSize(unsigned Level,
+                   const MCSystemModel::CacheLevelSet &Levels) const {
+    return Optional<unsigned>(0);
----------------
Meinersbur wrote:
> I don't understand the idea of the `resolve...` API. What is a target overriding it supposed to do? There are no overrides in this patch. Why not provide a default implementation of `getCacheSize` that targets can override?
This is something I don't have in our local implementation, but after posting the RFC, several people said they'd like to model GPUs and big.LITTLE-style systems.  In such hybrid systems there is no single "L2 cache," for example.  If a client asks the system model, "Give me the size of the L2 cache," what should it do?  Return the L2 of the CPU, the L2 of the GPU, or something else?  That's the idea behind the `resolve...` API: somebody has to decide what to do, and that is what these virtual hooks are for.

`SubtargetInfo` might not be the right place for this, but lacking a better place, I just put it here.  I could remove all of this for the initial changeset and just return some default that targets can override, as you suggest.  I'm definitely open to ideas/opinions here.
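Purely as an illustration of the intent, a heterogeneous target's override might look something like the sketch below.  The policy (answer with the smallest cache so clients stay conservative for every core) and the `getSize()` accessor name are assumptions for the example, not something this patch mandates:
```
  // Hypothetical target override: when caches at the requested level
  // differ across cores, report the smallest one so that clients make
  // conservative decisions for every core in the system.
  Optional<unsigned>
  MyTargetSubtargetInfo::resolveCacheSize(
      unsigned Level, const MCSystemModel::CacheLevelSet &Levels) const {
    Optional<unsigned> Smallest;
    for (const auto &Cache : Levels)
      if (!Smallest || Cache.getSize() < *Smallest)
        Smallest = Cache.getSize();
    return Smallest;
  }
```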


================
Comment at: llvm/include/llvm/MC/MCSystemModel.h:443
+
+  static const MCSystemModel Default;
+
----------------
Meinersbur wrote:
> Could we avoid the static initializer?
Maybe?  I took my cues from how the scheduler model is implemented, and it uses similar static initializers.  I'll look into this and see if there is a better way.
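One option would be to trade the namespace-scope object for a lazily-constructed function-local static, e.g. (a sketch, assuming the default model can be built on first use):
```
  // Sketch: construct the default model on first use instead of at
  // static initialization time, avoiding the global static initializer.
  const MCSystemModel &MCSystemModel::getDefault() {
    static const MCSystemModel DefaultModel;
    return DefaultModel;
  }
```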


================
Comment at: llvm/include/llvm/MC/MCSystemModel.h:468
+  using CacheLevelInfo = SmallVector<CacheLevelSet, 4>;
+  mutable CacheLevelInfo CacheLevels;
+
----------------
Meinersbur wrote:
> Why does it need to be `mutable`?
It doesn't.  This is left over from a time when I thought `initCacheInfoCache` and friends could be called lazily.  I'll rework this.


================
Comment at: llvm/include/llvm/Target/TargetSoftwarePrefetchConfig.td:44-47
+  int InstructionsAhead = id.Value;    // Prefetch this many instructions ahead,
+                                       // used by prefetchers that operate in
+                                       // terms of instruction distance rather
+                                       // than bytes.
----------------
Meinersbur wrote:
> ISA instructions or µOps? Why not cycles?
`LoopDataPrefetch` thinks in terms of IR `Instructions`.  I'll clarify the comment and the name.  Maybe we should reconsider how `LoopDataPrefetch` thinks about things, but I'd prefer to leave that for later work.  I want to be confident we can model the way things work today before we go changing a bunch of things.
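To be concrete, "instructions" here means LLVM IR `Instruction`s, i.e. roughly a count like the following (an illustrative sketch, not the pass's exact size metric):
```
  // Approximate the loop's size in IR instructions, which is the unit
  // LoopDataPrefetch reasons in when deciding how far ahead to prefetch.
  unsigned LoopSize = 0;
  for (BasicBlock *BB : L->blocks())
    LoopSize += BB->size();
```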


================
Comment at: llvm/include/llvm/Target/TargetSoftwarePrefetchConfig.td:48-52
+  int MaxIterationsAhead = ixd.Value;  // Don't prefetch more than this number
+                                       // of iterations ahead.  Used by
+                                       // prefetchers that operate in terms of
+                                       // instruction distance rather than
+                                       // bytes.
----------------
Meinersbur wrote:
> Isn't this algorithm-dependent, i.e. the size of the loop?
Yep.  Again, this is driven by `LoopDataPrefetch`.


================
Comment at: llvm/lib/MC/MCSubtargetInfo.cpp:144
+  if (Found == Models.end() || StringRef(Found->Key) != CPU) {
+    if (CPU != "help") // Don't error if the user asked for help.
+      errs() << "'" << CPU
----------------
Meinersbur wrote:
> I find this handling of `--help` strange, but the current `MCSubtargetInfo::getSchedModelForCPU` does the same thing.
Right.  That's what I used as guidance.


================
Comment at: llvm/lib/MC/MCSystemModel.cpp:27-48
+  MCWriteCombiningBufferInfo DefaultWCBuffers(0,
+                                              "Default Write-Combining Buffer",
+                                              0);
+
+  MCSoftwarePrefetcherConfig DefaultPrefetcherConfig(0,
+                                                     "Default Prefetcher",
+                                                     false,
----------------
Meinersbur wrote:
> where are these used?
They aren't anymore.  Will remove.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D58736/new/

https://reviews.llvm.org/D58736




