[LLVMdev] MCJit interface question

Thu Jun 4 13:15:47 PDT 2015

> Sounds good. Please let me know how your experiments go. I'm keen to improve the Orc APIs further, so your feedback would be very welcome.

LLILC is using ORC now, and it was a remarkably smooth transition / small change (https://github.com/dotnet/llilc/commit/47513add13980e7a32f9b0ec3d2da3db0911bca2).  The CoreCLR execution engine handles laziness (and symbol resolution) for us, so a straightforward application of IRCompileLayer using SimpleCompiler and ObjectLinkingLayer along the lines of the "initial" orc kaleidoscope example covered our current functionality neatly.

There were just two spots that I found myself writing some boilerplate:

One was in constructing the TargetMachine (to pass to the SimpleCompiler constructor).  With MCJit, the EngineBuilder took care of creating the TargetMachine, and the code to set up the EngineBuilder looked like this:

  std::unique_ptr<Module> M = ..;
  M->setTargetTriple(LLILC_TARGET_TRIPLE);

  EngineBuilder Builder(std::move(M)); // Builder takes ownership of module
  std::string ErrStr;
  Builder.setErrorStr(&ErrStr);
  Builder.setOptLevel(/* value depending on what the CLR EE requests */);
  Builder.setTargetOptions(/* value depending on what the CLR EE requests */);

  ExecutionEngine *NewEngine = Builder.create();

I noticed you used "EngineBuilder().selectTarget()" in the kaleidoscope sample, but I couldn't do exactly that for LLILC since we're threading the triple through the module that the builder wants to own.  It also seemed to make sense that we shouldn't need an EngineBuilder if we're not building an Engine, so I looked at what our Builder had been doing and wound up with this code to create the TargetMachine directly:

  std::string ErrStr;
  const llvm::Target *TheTarget =
      TargetRegistry::lookupTarget(LLILC_TARGET_TRIPLE, ErrStr);
  TargetOptions Options = /* value depending on what the CLR EE requests */;
  CodeGenOpt::Level OptLevel = /* value depending on what the CLR EE requests */;
  TargetMachine *TM = TheTarget->createTargetMachine(
      LLILC_TARGET_TRIPLE, "", "", Options, Reloc::Default, CodeModel::Default,
      OptLevel);

This tiny amount of boilerplate that I ended up with for creating the TargetMachine seems entirely reasonable, but going through the exercise and looking at EngineBuilder code (which of course has several more paths than reflected above) made me wonder how much of its logic is for building the TargetMachine and whether pulling it out into a TargetMachineBuilder would be useful to give ORC clients the support that EngineBuilder gives MCJit clients.

The second spot where I needed some code was getting a SymbolResolver to pass to the ObjectLinkingLayer.  In LLILC we actually don't need cross-module symbols resolved because the CoreCLR execution engine resolves them and gives the Jit raw addresses.  So I simply defined a NullResolver like so (I felt it was more readable this way than building an equivalent LambdaResolver):

/// \brief Stub \p SymbolResolver expecting no resolution requests
///
/// The ObjectLinkingLayer takes a SymbolResolver ctor parameter.
/// The CLR EE resolves tokens to addresses for the Jit during IL reading,
/// so no symbol resolution is actually needed at ObjLinking time.
class NullResolver : public llvm::RuntimeDyld::SymbolResolver {
public:
  llvm::RuntimeDyld::SymbolInfo findSymbol(const std::string &Name) final {
    llvm_unreachable("Reader resolves tokens directly to addresses");
  }

  llvm::RuntimeDyld::SymbolInfo
  findSymbolInLogicalDylib(const std::string &Name) final {
    llvm_unreachable("Reader resolves tokens directly to addresses");
  }
};

It occurs to me that if any other ORC clients are in the same boat, they'll need a resolver that looks a lot like this, but I think perhaps we're in an unusual boat.

One general comment I have is that I wish it were more easily discoverable what the constraints are on the various template parameters in ORC code, both from the client side (e.g. realizing that SymbolResolverPtrT should be a naked/unique/etc pointer to RuntimeDyld::SymbolResolver or a derived class) and the consumer side (e.g. realizing that an ObjSetHandleT needs to be moved rather than assigned).  The comment on IRCompileLayer mentions that BaseLayerT "must implement the object layer concept", for example, but I wasn't sure what exactly constitutes the object layer concept.  I had to dig into that one, as I went on to implement an object layer to satisfy the CoreCLR's requirements on EH frame reporting that started this thread.  I figured that implementing the methods on ObjectLinkingLayer would be a good bet, but approached it by starting with just addObjectSet and adding method implementations as the compiler complained they were missing.  With that approach, I never did need to implement findSymbolIn, mapSectionAddress, or emitAndFinalize, for reasons I haven't dug in to understand.  The object layer class I produced in this way was itself mostly boilerplate:

class ReserveSpaceLayerT {
public:
  typedef LLILCJit::LoadLayerT::ObjSetHandleT ObjSetHandleT;

  ReserveSpaceLayerT(LLILCJit::LoadLayerT *Loader, llilc::EEMemoryManager *MM) {
    this->Loader = Loader;
    this->MM = MM;
  }

  template <typename ObjSetT, typename MemoryManagerPtrT, typename SymbolResolverPtrT>
  ObjSetHandleT addObjectSet(const ObjSetT &Objects,
                             MemoryManagerPtrT MemMgr,
                             SymbolResolverPtrT Resolver) {
    for (const auto& Obj : Objects) {
      MM->reserveUnwindSpace(*Obj);
    }
    return Loader->addObjectSet(Objects, MemMgr, Resolver);
  }

  void removeObjectSet(ObjSetHandleT H) {
    Loader->removeObjectSet(std::move(H));
  }

  orc::JITSymbol findSymbol(StringRef Name, bool ExportedSymbolsOnly) {
    return Loader->findSymbol(Name, ExportedSymbolsOnly);
  }

  template <typename OwningMBSet>
  void takeOwnershipOfBuffers(ObjSetHandleT H, OwningMBSet MBs) {
    Loader->takeOwnershipOfBuffers(std::move(H), std::move(MBs));
  }

private:
  LLILCJit::LoadLayerT *Loader;
  llilc::EEMemoryManager *MM;
};

I could imagine that abstracting this might make a useful utility -- what I wanted was to call MM->reserveUnwindSpace on each ObjectFile as it passes through, and otherwise forward everything on to the base layer.  Perhaps a LambdaObjectLayer that takes a function<void(const ObjectFile*)> would be appropriate?  Let me know if you'd like to see a patch along those lines; I'm happy to contribute back, but don't have much context on what would be useful for other clients.
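
A LambdaObjectLayer along those lines might look roughly like the following. This is only a sketch under assumptions, not an existing ORC class: the name LambdaObjectLayer, the ObjectT template parameter (standing in for object::ObjectFile), and the FakeObject / FakeBaseLayer stand-ins used in the demo are all hypothetical, and the callback takes a reference rather than the pointer suggested above.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch, not an existing ORC class. Runs a callback on every
// object in a set, then forwards each request unchanged to the base layer.
// ObjectT stands in for object::ObjectFile.
template <typename BaseLayerT, typename ObjectT>
class LambdaObjectLayer {
public:
  typedef typename BaseLayerT::ObjSetHandleT ObjSetHandleT;

  LambdaObjectLayer(BaseLayerT &BaseLayer,
                    std::function<void(const ObjectT &)> Callback)
      : Base(BaseLayer), Visit(std::move(Callback)) {}

  template <typename ObjSetT, typename MemoryManagerPtrT,
            typename SymbolResolverPtrT>
  ObjSetHandleT addObjectSet(const ObjSetT &Objects, MemoryManagerPtrT MemMgr,
                             SymbolResolverPtrT Resolver) {
    for (const auto &Obj : Objects)
      Visit(*Obj); // e.g. MM->reserveUnwindSpace(*Obj) in the LLILC case
    return Base.addObjectSet(Objects, std::move(MemMgr), std::move(Resolver));
  }

  void removeObjectSet(ObjSetHandleT H) { Base.removeObjectSet(std::move(H)); }

  auto findSymbol(const std::string &Name, bool ExportedSymbolsOnly)
      -> decltype(std::declval<BaseLayerT &>().findSymbol(Name,
                                                          ExportedSymbolsOnly)) {
    return Base.findSymbol(Name, ExportedSymbolsOnly);
  }

  template <typename OwningMBSet>
  void takeOwnershipOfBuffers(ObjSetHandleT H, OwningMBSet MBs) {
    Base.takeOwnershipOfBuffers(std::move(H), std::move(MBs));
  }

private:
  BaseLayerT &Base;
  std::function<void(const ObjectT &)> Visit;
};

// Stand-ins for demonstration only; a real client would plug in the actual
// base layer and object file types.
struct FakeObject {
  unsigned UnwindBytes;
};

struct FakeBaseLayer {
  typedef int ObjSetHandleT;
  int NumSets = 0;

  template <typename ObjSetT, typename MemMgrT, typename ResolverT>
  ObjSetHandleT addObjectSet(const ObjSetT &, MemMgrT, ResolverT) {
    return ++NumSets;
  }
  void removeObjectSet(ObjSetHandleT) { --NumSets; }
  int findSymbol(const std::string &, bool) { return 0; }
};

// Sums the fake unwind sizes seen by the callback before forwarding.
unsigned demoReservedBytes() {
  FakeBaseLayer Base;
  unsigned Reserved = 0;
  LambdaObjectLayer<FakeBaseLayer, FakeObject> Layer(
      Base, [&Reserved](const FakeObject &Obj) { Reserved += Obj.UnwindBytes; });

  FakeObject A{16}, B{24};
  std::vector<const FakeObject *> Objects = {&A, &B};
  int Handle = Layer.addObjectSet(Objects, nullptr, nullptr);
  assert(Handle == 1 && "request was forwarded to the base layer");
  return Reserved;
}
```

In this sketch the stub base layer can omit takeOwnershipOfBuffers because a member template of a class template is only instantiated when something actually calls it.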

Thanks
-Joseph

From: Lang Hames [mailto:lhames at gmail.com]
Sent: Monday, June 1, 2015 1:41 PM
To: Joseph Tremoulet
Cc: Russell Hadley; llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] MCJit interface question

Hi Russell, Joseph,

>  I'll look into moving LLILC to ORC.

Sounds good. Please let me know how your experiments go. I'm keen to improve the Orc APIs further, so your feedback would be very welcome.

Cheers,
Lang.

On Sat, May 30, 2015 at 11:14 AM, Joseph Tremoulet <jotrem at microsoft.com> wrote:
Agreed, that sounds like the best plan.  I'll look into moving LLILC to ORC.

Thanks
-Joseph

From: Russell Hadley
Sent: Friday, May 29, 2015 8:13 PM
To: Lang Hames; Joseph Tremoulet
Cc: llvmdev at cs.uiuc.edu
Subject: RE: [LLVMdev] MCJit interface question

Hey Joseph,

What Lang said made me wonder.  Is it the right time for us (LLILC) to move to ORC?  The long term plan was to go there but this could be our forcing function.

-R

From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Lang Hames
Sent: Friday, May 29, 2015 2:23 PM
To: Joseph Tremoulet
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] MCJit interface question

Hi Joseph,

There are several reasons that a client might want to access the object before it's loaded, so a general API like #2 seems like the way to go.

To support this in MCJIT you can add this to the event listener API. Orc clients can already do this by adding a custom object-file layer.

- Lang.

On Fri, May 29, 2015 at 9:05 AM, Joseph Tremoulet <jotrem at microsoft.com> wrote:
Hi,

I think I need to make a small change to the MCJit interface, and would like some feedback on what the most appropriate option would be.

I'm working on LLILC (a jit for the CoreCLR built on MCJit, which creates one module for each MSIL method, containing the main function and zero or more EH handler functions extracted from the MSIL method).  The CoreCLR requires the jit to notify it of the size of the unwind info descriptors for each function in the module before reserving the memory it will be loaded into.  So we need a call (or calls) from the Jit/loader into the MemoryManager that runs at-or-before reserveAllocationSpace, is conceptually similar to registerEHFrames in that it's reserving EH frames, but that additionally needs to separate out the per-function information.

A few options come to mind:

1.       Add a needsToReserveEHFrames callback on MemoryManager (to parallel needsToReserveAllocationSpace), and a reserveEHFrames callback (parallel to registerEHFrames) that the loader would use to notify the memory manager if needsToReserveEHFrames() is true.  This seems at a high level the most straightforward fit for the LLILC requirement, but I don't know if for other targets it would even be possible to move the identification of EH frames (which depends on the 'LocalSections' computed in loadObjectImpl) to before calling reserveAllocationSpace.  I also don't know if that would be an undesirable "tight coupling" of RuntimeDyld with CoreCLR's particular interface.  (Note that this is really two options, in that the code to separate out the per-function EH frame contribution could go in either the client memory manager or in the loader.)

2.       Add a notification similar to NotifyObjectEmitted, but which runs just before the call to Dyld.loadObject.  Something like NotifyObjectPreLoaded.  The LLILC-specific MemoryManager could use this hook to scan the object::ObjectFile and pass whatever it needs to the CoreCLR.  This seems like a decent option to me, but I don't know if it would be considered a bad loss of encapsulation to pass out the object::ObjectFile in the state where it's been 'created' but not yet 'loaded'.

3.       Similar to #2, the existing reserveAllocationSpace callback on MemoryManager could simply take an additional parameter which is the object::ObjectFile.  This would be a more minimal change than #2 in terms of how much surface area it adds to the MCJit interface, but also a more invasive change in that all MemoryManager implementations would need to be updated with the reserveAllocationSpace signature change (though they could just ignore the new parameter).

4.       We could avoid needing to crack the object file for this information altogether; MCJit could add a hook where a client could insert passes into the PassManager used in emitObject; LLILC could attach a pass that would consult the MachineModuleInfo, where this information could be stored (it's similar to what's stored in the WinEHFuncInfos hanging off the MMI today).  But adding hooks for client passes might be opening a can of worms…

My inclination would be #2 or #3, but I would love some feedback on which of the tradeoffs seem most agreeable (and/or what options I've overlooked).
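
To make option #2 concrete, here is a minimal sketch of the ordering it asks for. Everything in it is hypothetical: FakeObjectFile stands in for object::ObjectFile, FakeJit stands in for the MCJit load path, and PreLoadListener is an assumed interface carrying the proposed NotifyObjectPreLoaded hook. None of these exist in LLVM.

```cpp
#include <string>
#include <vector>

// Stand-in for object::ObjectFile: created, but not yet loaded.
struct FakeObjectFile {
  std::string Name;
  unsigned UnwindBytes;
};

// Hypothetical listener interface carrying the hook proposed in option #2.
class PreLoadListener {
public:
  virtual ~PreLoadListener() {}
  virtual void NotifyObjectPreLoaded(const FakeObjectFile &Obj) = 0;
};

// Stand-in for the JIT's load path.
class FakeJit {
public:
  void registerListener(PreLoadListener *L) { Listeners.push_back(L); }

  void loadObject(const FakeObjectFile &Obj) {
    // The hook runs first, before any memory is reserved for the object.
    for (PreLoadListener *L : Listeners)
      L->NotifyObjectPreLoaded(Obj);
    // Stands in for the call to Dyld.loadObject.
    Events.push_back("load " + Obj.Name);
  }

  std::vector<std::string> Events;

private:
  std::vector<PreLoadListener *> Listeners;
};

// Client-side listener: scans the object and reports what the EE needs.
class UnwindSizeReporter : public PreLoadListener {
public:
  explicit UnwindSizeReporter(std::vector<std::string> &Log) : Log(Log) {}

  void NotifyObjectPreLoaded(const FakeObjectFile &Obj) override {
    Log.push_back("reserve " + std::to_string(Obj.UnwindBytes));
  }

private:
  std::vector<std::string> &Log;
};

// Records the order in which the hook and the load are observed.
std::vector<std::string> demoEventOrder() {
  FakeJit Jit;
  UnwindSizeReporter Reporter(Jit.Events);
  Jit.registerListener(&Reporter);
  Jit.loadObject({"method1", 32});
  return Jit.Events;
}
```

The point of the sketch is only the ordering: the client sees the object, and can report unwind sizes, before the load step runs.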

Thanks
-Joseph

