[LLVMdev] Memory Subsystem Representation

Tue May 3 09:20:00 PDT 2011

On Tue, May 3, 2011 at 8:40 AM, David Greene <dag at cray.com> wrote:
> For a while now we (Cray) have had some very primitive cache structure
> information encoded into our version of LLVM.  Given the more complex
> memory structures introduced by Bulldozer and various accelerators, it's
> time to do this Right (tm).
>
> So I'm looking for some feedback on a proposed design.
>
> The goal of this work is to provide Passes with useful information such
> as cache sizes, resource sharing arrangements, etc. so that they may do
> transformations to improve memory system performance.
>
> Here's what I'm thinking this might look like:
>
> - Add two new structures to the TargetMachine class: TargetMemoryInfo
>  and TargetExecutionEngineInfo.
>
> - TargetMemoryInfo will initially contain cache hierarchy information.
>  It will contain a list of CacheLevelInfo objects, each of which will
>  specify at least the total size of the cache at that level.  It may
>  also include other useful bits like associativity, inclusivity, etc.
>
> - TargetMemoryInfo could be extended with information about various
>  "special" memory regions such as local, shared, etc. memory typical on
>  accelerators.  This should tie into the address space mechanism
>  somehow.
>
> - TargetExecutionEngineInfo (probably need a better name) will contain a
>  list of ExecutionResourceInfo objects, such as threads, cores,
>  modules, sockets, etc.  For example, for a Bulldozer-based system, we
>  would have a set of cores contained in a module, a set of modules
>  contained in a socket and so on.
>
> - Each ExecutionResourceInfo object would contain a name to identify the
>  grouping ("thread," "core," etc.) along with information about the
>  number of execution resources it contains.  For example, a "core"
>  object might specify that it contains two "threads."
>
> - ExecutionResourceInfo objects would also contain links to
>  CacheLevelInfo objects to model how the various levels of cache are
>  shared.  For example, on a Bulldozer system the "core" object would
>  have a link to the L1 CacheLevelInfo object, indicating that L1 is
>  private to a "core."  A "module" object would have a link to the L2
>  CacheLevelInfo object, indicating that it is private to a "module" but
>  shared by "cores" within the "module" and so on.
>
> I don't particularly like the names TargetExecutionEngineInfo and
> ExecutionResourceInfo but couldn't come up with anything better.  Any
> ideas?
>
> Does this seem like a reasonable approach?
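
To make sure I'm reading the proposal correctly, here is a rough sketch of
the structures as described above. The type names come from the proposal,
but none of this exists in LLVM: the field names, the index-based cache
link, the MachineModel wrapper, the two helper functions, and the cache
sizes and counts are all assumptions for illustration only.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical sketch of the proposed design; not an existing LLVM API.

// One level of the cache hierarchy.
struct CacheLevelInfo {
  unsigned Level;          // 1 for L1, 2 for L2, ...
  std::uint64_t SizeBytes; // total size of the cache at this level
  unsigned Associativity;  // optional detail; 0 if unknown
};

// A grouping of execution resources ("core", "module", "socket", ...).
struct ExecutionResourceInfo {
  std::string Name;      // name identifying the grouping
  std::string ChildName; // kind of resource it contains; empty for a leaf
  unsigned NumChildren;  // how many of those it contains
  int CacheIndex;        // index into TargetMemoryInfo::Levels of the cache
                         // private to this grouping, or -1 for none
};

struct TargetMemoryInfo {
  std::vector<CacheLevelInfo> Levels;
};

struct TargetExecutionEngineInfo {
  std::vector<ExecutionResourceInfo> Resources;
};

// Assumed wrapper standing in for the two new TargetMachine members.
struct MachineModel {
  TargetMemoryInfo Mem;
  TargetExecutionEngineInfo Exec;
};

// Builds a Bulldozer-like model: L1 private to a core, L2 private to a
// module and shared by its cores. All numbers are illustrative placeholders.
MachineModel buildBulldozerLikeModel() {
  MachineModel M;
  M.Mem.Levels.push_back(CacheLevelInfo{1, 16 * 1024, 4});
  M.Mem.Levels.push_back(CacheLevelInfo{2, 2 * 1024 * 1024, 16});
  M.Exec.Resources.push_back(ExecutionResourceInfo{"core", "", 0, 0});
  M.Exec.Resources.push_back(ExecutionResourceInfo{"module", "core", 2, 1});
  M.Exec.Resources.push_back(ExecutionResourceInfo{"socket", "module", 4, -1});
  return M;
}

// Returns the cache level linked from the named grouping, or nullptr if the
// grouping is unknown or has no cache link.
const CacheLevelInfo *privateCacheOf(const MachineModel &M,
                                     const std::string &Name) {
  for (const ExecutionResourceInfo &R : M.Exec.Resources) {
    if (R.Name != Name || R.CacheIndex < 0)
      continue;
    std::size_t Idx = static_cast<std::size_t>(R.CacheIndex);
    if (Idx < M.Mem.Levels.size())
      return &M.Mem.Levels[Idx];
  }
  return nullptr;
}
```

With this layout, a pass would ask privateCacheOf(M, "module") and get the
L2 entry, and read NumChildren on the "module" object to learn how many
cores share it.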

The names and the exact information stored don't seem like they really
need review; those are easy to change later.  Just two questions:

1. What is the expected use?  Are we talking about loop optimizations here?
2. IR-level passes don't have access to a TargetMachine; is that okay?

-Eli