[llvm-dev] Metadata in LLVM back-end

David Greene via llvm-dev llvm-dev at lists.llvm.org
Fri Aug 7 14:09:11 PDT 2020

Chris Lattner via llvm-dev <llvm-dev at lists.llvm.org> writes:

> The issue is this: either information is preserved across certain
> sorts of transformations or it is not.  If not, it either goes stale
> (problematic for anything that looks at it later) or is
> invalidated/removed.
> The fundamental issue in IR design is factoring the representation of
> information from the code that needs to inspect and update it.
> “Metadata” designs try to make it easy to add out of band information
> to the IR in various ways, with a goal of reducing the impact on the
> rest of the compiler.
> However, I’ve never seen them work out well.  Either the data becomes
> stale, or you end up changing a lot of the compiler to support it.
> Look at debug info metadata in LLVM for example, it has both problems
> :-).  This is why MLIR has moved to make source location information
> and attributes a first class part of the IR.

I basically agree with your analysis.  Some information is so pervasive
that it really should be a part of the IR proper.  But other information
may not be.  The kind of information I'm thinking of basically boils
down to optimization hints.  It's fine and semantically sound to drop
it, though not ideal if it can be avoided.

I see debug info as being in a quite different class.  With the -g
option we are making a promise to our users.  So using a mechanism that
by design doesn't make promises seems a poor fit.

A long, long time ago, in the dark ages before git and Phabricator, I
submitted a patch for review that would have added comment information
to machine instructions.  It was basically a string member on every
MachineInstr.  At the time it was deemed too expensive, and rightly so.
Instead I ended up adding some flag values that the AsmPrinter uses as a
hint to generate various comments.  I'm still not very happy with that
"solution" and a more general-purpose mechanism for annotating
IR/SelectionDAG/MIR objects would be quite welcome.
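
To make that trade-off concrete, here is a minimal sketch of the two
designs.  It is a toy model, not LLVM's actual code: StringInstr,
FlaggedInstr, CommentHint and commentFor are hypothetical names invented
for illustration, and the hint values are made up.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a machine instruction; not LLVM's MachineInstr.
// Rejected design: every instruction carries its own comment string.
struct StringInstr {
  unsigned Opcode = 0;
  std::string Comment; // paid for by every instruction, even when empty
};

// Hypothetical hint bits; the names and values are illustrative only.
enum CommentHint : std::uint8_t {
  HintNone = 0,
  HintReload = 1u << 0,
  HintSpill = 1u << 1,
};

// Flag-based design: one byte of hints per instruction.
struct FlaggedInstr {
  unsigned Opcode = 0;
  std::uint8_t Hints = HintNone;

  void addHint(CommentHint H) {
    assert(H != HintNone && "adding no hint is a no-op and likely a bug");
    Hints |= H;
  }
  bool hasHint(CommentHint H) const { return (Hints & H) != 0; }
};

// Printer side: the comment text lives here, not on the instruction, so
// only comments the printer already knows about can ever be emitted.
std::string commentFor(const FlaggedInstr &MI) {
  std::string Out;
  auto append = [&Out](const char *Text) {
    if (!Out.empty())
      Out += ", ";
    Out += Text;
  };
  if (MI.hasHint(HintReload))
    append("reload");
  if (MI.hasHint(HintSpill))
    append("spill");
  return Out;
}
```

The flag version is cheap per instruction, but each new kind of comment
needs a new bit and a matching case in the printer, which is the
limitation a general-purpose annotation mechanism would remove.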

A generic first-class annotation construct would cover both use-cases:
droppable optimization hints and comment-style annotations on
instructions.  If you and the wider community are open to adding a
first-class generic annotation mechanism, I'm eager to work on it!

