[llvm-dev] RFC: System (cache, etc.) model for LLVM
David Greene via llvm-dev
llvm-dev at lists.llvm.org
Wed Nov 7 14:26:32 PST 2018
Andrea Di Biagio <andrea.dibiagio at gmail.com> writes:
> Hi David,
Hi Andrea!
> I like your idea of providing a framework to describe the cache
> hierarchy/prefetchers etc.
> I can definitely think of a couple of ways to reuse that bit of
> information in llvm-mca.
Great! I was hoping it would be useful for that.
> I have a few questions about the general design.
Thank you so much for your feedback.
> It looks like in your model, you don't describe whether a cache is
> inclusive or exclusive. I am not sure if that is useful for the
> algorithms that you have in mind. However, in some processors, the LLC
> (last level cache) is exclusive, and doesn't include lines from the
> lower levels. You could potentially have a mix of inclusive/exclusive
> shared/private caches; you may want to model those aspects.
Yes, I can see how that would be useful for simulation purposes. It's
less useful for the compiler cache optimizations I have in mind because
those are driven by heuristics anyway, and details like inclusivity are
basically noise.
It would certainly be possible to add an inclusive/exclusive property
to a cache level. I think that having the property implicitly reference
the next level up would make sense (e.g. L3 is inclusive of L2 or L2 is
exclusive of L1). Then if, say, L3 is inclusive of L2 and L2 is
inclusive of L1, one could by transitivity reason that L3 is inclusive
of L1. What do you think?
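To make that concrete, here's a rough C++ sketch of the kind of
property and transitivity check I have in mind (all of the names below
are made up for illustration; none of this is the proposed API):

  #include <cstdint>
  #include <vector>

  // Hypothetical sketch: each cache level records its relationship to
  // the adjacent level closer to the core (L2 describes itself
  // relative to L1, L3 relative to L2, and so on).
  enum class InclusionPolicy { Unknown, Inclusive, Exclusive };

  struct CacheLevelInfo {
    unsigned Level;         // 1 for L1, 2 for L2, ...
    uint64_t SizeBytes;
    InclusionPolicy Policy; // Relationship to level (Level - 1).
  };

  // True if level Outer is known to (transitively) include the
  // contents of level Inner, given Levels ordered L1, L2, ..., LN.
  static bool isInclusiveOf(const std::vector<CacheLevelInfo> &Levels,
                            unsigned Outer, unsigned Inner) {
    if (Outer <= Inner || Outer > Levels.size())
      return false;
    // L3 inclusive of L2 and L2 inclusive of L1 implies L3 inclusive
    // of L1.
    for (unsigned L = Inner + 1; L <= Outer; ++L)
      if (Levels[L - 1].Policy != InclusionPolicy::Inclusive)
        return false;
    return true;
  }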
> When you mention "write combining", you do that in the context of
> streaming buffers used by nontemporal operations.
> On x86, a write combining buffer allows temporally adjacent stores to
> be combined. Those stores would bypass the cache and be committed
> together as a single store transaction. Write combining may fail for a
> number of reasons; for example: there may be alignment/size
> requirements; stores are not allowed to overlap; etc. Describing all
> those constraints could be problematic (and maybe outside of the scope
> of what you want to do). I guess, it is unclear at this stage (at
> least to me) how much information is required in practice.
So far what I've outlined is what we've found to be sufficient for the
compiler optimizations we have, but again, for simulator modeling much
more would be desired. On x86, only memory mapped as WC
(write-combining) in the MTRRs actually uses the buffers, along with NT
stores to WB (write-back) memory. I believe most user memory is mapped
WB, so general stores won't use the write-combining buffers. There's
also an ordering issue with write-combining: it wouldn't be safe to use
in general code without being very careful and inserting lots of
fences.
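Just to illustrate the ordering issue (this isn't part of the proposal,
and these intrinsics are only one way to spell it), NT code on x86
typically has to fence before anything else can rely on the stores:

  #include <emmintrin.h> // SSE2: _mm_set1_epi32, _mm_stream_si128
  #include <xmmintrin.h> // SSE:  _mm_sfence
  #include <cstddef>

  // Fill dst (16-byte aligned, count a multiple of 4 ints) with value
  // using non-temporal stores, which go through the write-combining
  // buffers rather than the cache hierarchy.
  void ntFill(int *dst, int value, std::size_t count) {
    __m128i v = _mm_set1_epi32(value);
    for (std::size_t i = 0; i < count; i += 4)
      _mm_stream_si128(reinterpret_cast<__m128i *>(dst + i), v);
    // NT stores are weakly ordered; fence before other code (or
    // another thread) depends on them being globally visible.
    _mm_sfence();
  }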
Modeling the use of MTRRs to represent different types of memory is out
of scope for this proposal but could be added later, I think.
The key thing I'm trying to model with the stream/write-combining
buffers is the limited number of buffers available to handle streams of
NT operations. These same resources would be used for WC-typed memory,
so they could apply to more than NT operations. That could be important
for compilers that deal with code that makes heavy use of
write-combining (manipulating video output, for example).
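As a sketch of how an optimization might consume that information
(again, the types and names here are invented for illustration, not
proposed API), a loop-distribution heuristic could cap the number of NT
output streams per loop at the number of buffers:

  // Hypothetical description of the streaming/WC buffer resource.
  struct StreamBufferInfo {
    unsigned NumBuffers; // Streaming/WC buffers available per core.
  };

  // Given the number of distinct non-temporal output streams a loop
  // body writes, return how many loops to distribute it into so that
  // no resulting loop drives more streams than there are buffers.
  unsigned numLoopsAfterDistribution(const StreamBufferInfo &SBI,
                                     unsigned NumNTStreams) {
    if (SBI.NumBuffers == 0 || NumNTStreams == 0)
      return 1; // No information or nothing to split; leave it alone.
    return (NumNTStreams + SBI.NumBuffers - 1) / SBI.NumBuffers;
  }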
> Ideally, it would be nice to have the concept of "memory type", and
> map memory types to resources/memory consistency models. Not sure if
> there is already a way to do that mapping, nor if it would improve
> your existing framework. In theory, you could add the ability to
> describe constraints/resources for memory types, and then have users
> define how memory operations map to different types. That information
> would then drive the algorithm/cost model that computes resource
> allocation/schedule. However, I may be thinking too much about
> possible use cases ;-).
Yeah, that would be nice for a number of use cases. That level of
detail is beyond the scope of the current work, but it's an interesting
idea and I'll certainly keep it in mind as I work through this.
> That being said, I like having extra information to feed to the
> compiler (and llvm-mca :-)). Let's see what other people think about it.
So far the responses look positive, with questions around API naming
and such. I haven't heard anything show-stopping yet. Hopefully I'll
get a chance to start posting patches soon.
-David