[llvm-dev] RFC: System (cache, etc.) model for LLVM

Andrea Di Biagio via llvm-dev llvm-dev at lists.llvm.org
Thu Nov 8 07:14:51 PST 2018


On Wed, Nov 7, 2018 at 10:26 PM David Greene <dag at cray.com> wrote:

> Andrea Di Biagio <andrea.dibiagio at gmail.com> writes:
>
> > Hi David,
>
> Hi Andrea!
>
> > I like your idea of providing a framework to describe the cache
> > hierarchy/prefetchers etc.
> > I can definitely think of a couple of ways to reuse that bit of
> > information in llvm-mca.
>
> Great!  I was hoping it would be useful for that.
>
> > I have a few questions about the general design.
>
> Thank you so much for your feedback.
>
> > It looks like in your model, you don't describe whether a cache is
> > inclusive or exclusive. I am not sure if that is useful for the
> > algorithms that you have in mind. However, in some processors, the LLC
> > (last level cache) is exclusive, and doesn't include lines from the
> > lower levels. You could potentially have a mix of inclusive/exclusive
> > shared/private caches; you may want to model those aspects.
>
> Yes, I can see for simulation purposes that would be useful.  It's not
> so useful in the compiler cache optimizations I have in mind because
> everything is a heuristic anyway and details like inclusivity are
> basically noise.
>
> It would certainly be possible to add an inclusive/exclusive property on
> a cache level.  I think that having the property implicitly reference
> the next level up would make sense (e.g. L3 is inclusive of L2 or L2 is
> exclusive of L1).  Then if, say, L3 is inclusive of L2 and L2 is
> inclusive of L1, one could reason by transitivity that L3 is inclusive
> of L1.  What do you think?
>

That would be nice to have. Thanks.
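
Just to make the idea concrete, here is a rough C++ sketch of what such a
property (and the transitivity reasoning) could look like. All of the names
below are made up for illustration and are not part of the actual proposal:

#include <cstdint>
#include <vector>

// Rough sketch only -- names and layout are illustrative, not the proposed
// LLVM interface.
enum class InclusivityPolicy { Unknown, Inclusive, Exclusive };

struct CacheLevelInfo {
  unsigned Level;               // 1 = L1, 2 = L2, ...
  uint64_t SizeInBytes;
  unsigned LineSizeInBytes;
  // Relationship to the next level down (e.g. L2's policy says whether L2
  // is inclusive or exclusive of L1).
  InclusivityPolicy Policy = InclusivityPolicy::Unknown;
};

// Transitivity: Hi is inclusive of Lo if every level from Lo+1 up to Hi is
// inclusive of the level below it.
inline bool isTransitivelyInclusive(const std::vector<CacheLevelInfo> &Levels,
                                    unsigned Lo, unsigned Hi) {
  for (const CacheLevelInfo &L : Levels)
    if (L.Level > Lo && L.Level <= Hi &&
        L.Policy != InclusivityPolicy::Inclusive)
      return false;
  return true;
}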


> > When you mention "write combining", you do that in the context of
> > streaming buffers used by nontemporal operations.
> > On x86, a write combining buffer allows temporally adjacent stores to be
> > combined together. Those stores would bypass the cache and be committed
> > together as a single store transaction. Write combining may fail for a
> > number of reasons; for example: there may be alignment/size
> > requirements; stores are not allowed to overlap; etc. Describing all
> > those constraints could be problematic (and maybe outside of the scope
> > of what you want to do). I guess, it is unclear at this stage (at
> > least to me) how much information is required in practice.
>
> So far what I've outlined is what we've found to be sufficient for the
> compiler optimizations we have, but again, for simulator modeling much
> more would be desired.  On the X86, only memory mapped as WC
> (write-combining) in the MTRRs actually uses the buffers, in addition to
> NT stores to WB (write-back) memory.  I believe most user memory is
> mapped WB so general stores won't use the write-combining buffers.
> There's an ordering issue with using write-combining and it wouldn't be
> safe to use it in general code without being very careful and inserting
> lots of fences.
>


Yeah. I was just curious; the term "write combining" is kind of
overloaded and takes on a different meaning depending on the context.
I don't think it is a problem if write combining memory is outside of the
scope of this work.
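
To make the ordering caveat you mention concrete, this is the kind of
pattern I think of when NT stores are involved (illustration only, using
the standard x86 intrinsics):

#include <immintrin.h>
#include <cstddef>

// x86 non-temporal stores are weakly ordered and go through
// write-combining buffers, so a store fence is needed before other
// threads can safely observe the data.
void copy_nt(float *__restrict Dst, const float *__restrict Src,
             std::size_t N) {
  // Assumes Dst is 16-byte aligned and N is a multiple of 4.
  for (std::size_t I = 0; I < N; I += 4) {
    __m128 V = _mm_loadu_ps(Src + I);
    _mm_stream_ps(Dst + I, V); // NT store: bypasses the cache via a WC buffer.
  }
  _mm_sfence(); // Drain the write-combining buffers and restore ordering.
}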


> Modeling the use of MTRRs to represent different types of memory is out
> of scope for this proposal but could be added later, I think.
>
> The key thing I'm trying to model with the stream/write-combining
> buffers is the limited buffer resources to handle streams of NT
> operations.  These same resources would be used for WC-typed memory so
> they could apply to more than NT operations.  That could be important for
> compilers that deal with code that wants to heavily use write-combining
> (like manipulating video output, for example).
>
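
If I understand the buffer modeling correctly, even something this simple
would already be useful to a cost model (hypothetical names, just to
illustrate):

// Hypothetical sketch of how a target could advertise its limited number
// of streaming / write-combining buffers, so a cost model can avoid
// creating more concurrent NT output streams than the hardware can sustain.
struct StreamBufferInfo {
  unsigned NumBuffers;        // e.g. fill/WC buffers available per core
  unsigned BufferSizeInBytes; // typically one cache line
};

// Toy heuristic: only use NT stores when the loop's distinct output streams
// fit in the available buffers.
inline bool shouldUseNonTemporalStores(const StreamBufferInfo &SBI,
                                       unsigned NumOutputStreams) {
  return NumOutputStreams <= SBI.NumBuffers;
}
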
> > Ideally, it would be nice to have the concept of "memory type", and
> > map memory types to resources/memory consistency models. Not sure if
> > there is already a way to do that mapping, nor if it would improve
> > your existing framework. In theory, you could add the ability to
> > describe constraints/resources for memory types, and then have users
> > define how memory operations map to different types. That information
> > would then drive the algorithm/cost model that computes resource
> > allocation/schedule. However, I may be thinking too much about
> > possible use cases ;-).
>
> Yeah, that would be nice for a number of use cases.  That level of
> detail is beyond the scope of the current work but it's an interesting
> idea and I'll certainly keep it in mind as I work through this.
>
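
For concreteness, the kind of memory-type mapping I had in mind could
eventually look something like this (again purely hypothetical, and clearly
beyond the scope of the current proposal):

// Hypothetical sketch only: one possible way to map "memory types" to the
// buffer resources and ordering rules they imply. The numbers below are
// made up for illustration.
enum class MemoryType { WriteBack, WriteCombining, Uncacheable };

struct MemoryTypeInfo {
  bool UsesStreamBuffers;  // does this type go through WC/stream buffers?
  bool RequiresFences;     // weakly ordered: needs explicit fences
  unsigned MaxOutstanding; // resource limit for in-flight transactions
};

// A target hook would fill in one MemoryTypeInfo per MemoryType; users
// would say which type a given memory operation targets, and the cost
// model would pick resources/ordering constraints accordingly.
inline MemoryTypeInfo getMemoryTypeInfo(MemoryType MT) {
  switch (MT) {
  case MemoryType::WriteBack:      return {false, false, ~0u};
  case MemoryType::WriteCombining: return {true,  true,  8};
  case MemoryType::Uncacheable:    return {false, true,  1};
  }
  return {false, false, 0}; // unreachable; keeps compilers happy
}
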
> > That being said, I like having extra information to feed to the
> > compiler (and llvm-mca :-)). Let's see what other people think about it.
>
> So far it looks like positive responses with questions around API naming
> and such.  I haven't heard anything show-stopping yet.  Hopefully I'll
> get a chance to start posting patches soon.
>
>
I look forward to seeing your patches soon.

Cheers,
Andrea

>                            -David