[llvm-dev] TargetRegistry and MC object ownership.

Wed Oct 11 11:12:03 PDT 2017

Hi All,

While trying to put together an MC-based assembler the other day, I again
encountered MC's non-obvious memory ownership rules. The most egregious
example of these is MCObjectStreamer's destructor:

MCObjectStreamer::~MCObjectStreamer() {
  delete &Assembler->getBackend();
  delete &Assembler->getEmitter();
  delete &Assembler->getWriter();
  delete Assembler;
}

In the depths of a fever from a head-cold, I snapped. I've been hacking MC
to convert these raw pointers (and worse: references!) to unique_ptrs
(apologies to people whose backbends I've broken), but I hit a big blocker
when I get to Target in "llvm/Support/TargetRegistry.h".

Target vends MC objects by calling registered ctor functions. E.g.:

MCAsmBackend *createMCAsmBackend(const MCRegisterInfo &MRI,
                                 StringRef TheTriple, StringRef CPU,
                                 const MCTargetOptions &Options) const {
  if (!MCAsmBackendCtorFn)
    return nullptr;
  return MCAsmBackendCtorFn(*this, MRI, Triple(TheTriple), CPU, Options);
}

The callee owns the resulting object so ideally we would return a
unique_ptr<MCAsmBackend>, but to do this we'd need access to the definition
of MCAsmBackend which we can't get without violating layering. (We need the
definition because unique_ptr<MCAsmBackend> can notionally call
MCAsmBackend's destructor, though we know it won't here as the whole point
is to hand ownership back to the caller).

There are three approaches I can think of to improve things:

(1) Keep the raw pointers, but document *very* clearly, recommending the
caller stash the result in a unique_ptr immediately.

(2) Have the ctor functions take a unique_ptr<T>* as their first argument
and store the result there (thanks to Dave Blaikie for this suggestion).
Passing the unique_ptrs by pointer means we don't need a definition for T.
Usage looks like:

std::unique_ptr<MCAsmBackend> MAB;
Target.createMCAsmBackend(&MAB, TheTriple, CPU, Options);

(3) Introduce a 'transient_unique_ptr<T>' type that never calls T's
destructor, and whose purpose is to pass off ownership to a real unique_ptr:

template <typename T>
class transient_unique_ptr {
public:
  transient_unique_ptr(std::unique_ptr<T> InVal) : Val(InVal.release()) {}
  transient_unique_ptr(transient_unique_ptr<T> &&Other) :
Val(Other.Val) { Other.Val
= nullptr; }
  transient_unique_ptr&& operator=(transient_unique_ptr<T> &&Other) {
std::swap(Val,
Other.Val); return *this; }
  ~transient_unique_ptr() { assert(!Val && "value should have been handed
off"); }
  std::unique_ptr<T> take() { auto Tmp = Val; Val = nullptr; return
unique_ptr<T>(Val); }
private:
  T *Val = nullptr;
};

This type can also operate without needing a T definition. Usage looks like:

auto MAB = Target.createMCAsmBackend(TheTriple, CPU, Options).take();

Option (1) is a minimum. Option (2) and (3) are both compromises to deal
with layering rules, but of those I think (2) probably comes closest to
adhering to the principle of least surprise.

Who has thoughts? If we can settle on an approach I'd love to press forward
with the cleanup.

Cheers,
Lang.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20171011/6af587e5/attachment.html>