[llvm-dev] TargetRegistry and MC object ownership.

Wed Oct 11 11:19:48 PDT 2017

On Wed, Oct 11, 2017 at 11:12 AM Lang Hames <lhames at gmail.com> wrote:

> Hi All,
>
> While trying to put together an MC-based assembler the other day, I again
> encountered MC's non-obvious memory ownership rules. The most egregious
> example of these is MCObjectStreamer's destructor:
>
> MCObjectStreamer::~MCObjectStreamer() {
>   delete &Assembler->getBackend();
>   delete &Assembler->getEmitter();
>   delete &Assembler->getWriter();
>   delete Assembler;
> }
>
> In the depths of a fever from a head-cold, I snapped. I've been hacking MC
> to convert these raw pointers (and worse: references!) to unique_ptrs
> (apologies to people whose backbends I've broken), but I hit a big blocker
> when I get to Target in "llvm/Support/TargetRegistry.h".
>
> Target vends MC objects by calling registered ctor functions. E.g.:
>
> MCAsmBackend *createMCAsmBackend(const MCRegisterInfo &MRI,
>                                  StringRef TheTriple, StringRef CPU,
>                                  const MCTargetOptions &Options) const {
>   if (!MCAsmBackendCtorFn)
>     return nullptr;
>   return MCAsmBackendCtorFn(*this, MRI, Triple(TheTriple), CPU, Options);
> }
>
> The callee owns the resulting object so ideally we would return a
> unique_ptr<MCAsmBackend>, but to do this we'd need access to the definition
> of MCAsmBackend which we can't get without violating layering. (We need the
> definition because unique_ptr<MCAsmBackend> can notionally call
> MCAsmBackend's destructor, though we know it won't here as the whole point
> is to hand ownership back to the caller).
>

I realize your goals are somewhat more immediate, but I wouldn't mind
better understanding the layering here and what might be a good end goal to
know how much work it is, where people should spend it if they want to move
towards that goal, etc. Any ideas how this should be designed to have good
layering?

> There are three approaches I can think of to improve things:
>
> (1) Keep the raw pointers, but document *very* clearly, recommending the
> caller stash the result in a unique_ptr immediately.
>
> (2) Have the ctor functions take a unique_ptr<T>* as their first argument
> and store the result there (thanks to Dave Blaikie for this suggestion).
> Passing the unique_ptrs by pointer means we don't need a definition for T.
> Usage looks like:
>
> std::unique_ptr<MCAsmBackend> MAB;
> Target.createMCAsmBackend(&MAB, TheTriple, CPU, Options);
>

This could be a reference rather than a pointer, FWIW - and while the
callsite might look a bit wonky, no one could readily misuse it and anyone
who went looking to fix it would quickly find the comments explaining why,
or if they ignored it, would hit the layering issue we know about already.

> (3) Introduce a 'transient_unique_ptr<T>' type that never calls T's
> destructor, and whose purpose is to pass off ownership to a real unique_ptr:
>
> template <typename T>
> class transient_unique_ptr {
> public:
>   transient_unique_ptr(std::unique_ptr<T> InVal) : Val(InVal.release()) {}
>   transient_unique_ptr(transient_unique_ptr<T> &&Other) : Val(Other.Val) { Other.Val
> = nullptr; }
>   transient_unique_ptr&& operator=(transient_unique_ptr<T> &&Other) { std::swap(Val,
> Other.Val); return *this; }
>   ~transient_unique_ptr() { assert(!Val && "value should have been handed
> off"); }
>   std::unique_ptr<T> take() { auto Tmp = Val; Val = nullptr; return
> unique_ptr<T>(Val); }
> private:
>   T *Val = nullptr;
> };
>
> This type can also operate without needing a T definition. Usage looks
> like:
>
> auto MAB = Target.createMCAsmBackend(TheTriple, CPU, Options).take();
>

The risk here is that it would be easy to miss the 'take()' call (& auto
especially makes that risky - if it weren't for auto we could use
unnameable type tricks to make it obvious that someone shouldn't make a
local variable (that wouldn't help if template argument deduction were used
like: func(createMCAsmBackend(...)); ))

> Option (1) is a minimum. Option (2) and (3) are both compromises to deal
> with layering rules, but of those I think (2) probably comes closest to
> adhering to the principle of least surprise.
>

Yeah - I think people will find (2) quirky but doesn't mean building whole
new constructs (that might end up with more usage than we'd like) for these
kinds of layering problems. It's where I'd lean.

> Who has thoughts? If we can settle on an approach I'd love to press
> forward with the cleanup.
>
> Cheers,
> Lang.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20171011/b97fd56a/attachment.html>