[llvm-dev] Metadata in LLVM back-end

Wed Aug 19 13:37:38 PDT 2020

Lorenzo Casalino via llvm-dev <llvm-dev at lists.llvm.org> writes:

>>> I was imagining a per-instruction data-structure collecting metadata info
>>> related to that specific instruction, instead of having several metadata info
>>> directly embedded in each instruction.

>> Interesting.  At the IR level metadata isn't necessarily unique, though
>> it can be made so.  If multiple pieces of information were amalgamated
>> into one structure that might reduce the ability to share the in-memory
>> representation, which has a cost.
>>
> Uhm...could I ask you to elaborate a bit more on the "limitation on
> in-memory representation sharing"? It is not clear to me how this
> would cause a problem.

I just mean that at the IR level, if you have one metadata node with,
say, the string "foo bar" on one instruction and another node with "foo"
on a second instruction, the two won't share an in-memory
representation.  If you instead had separate nodes for "foo" and "bar",
with both attached to the first instruction and just "foo" attached to
the second, the "foo" metadata would be shared.
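To make the sharing point concrete, here is a minimal toy model in
Python.  It is not LLVM's API: Context, MetadataNode and get_node are
hypothetical names, and the only assumption is that nodes are uniqued
by their whole content.

```python
class MetadataNode:
    """A metadata node holding a single string (toy model, not LLVM)."""

    def __init__(self, text):
        self.text = text


class Context:
    """Hypothetical context that uniques nodes by their full content."""

    def __init__(self):
        self._nodes = {}

    def get_node(self, text):
        # Reuse the node for this exact string if one exists already.
        # "foo bar" and "foo" are different strings, so they never match.
        if text not in self._nodes:
            self._nodes[text] = MetadataNode(text)
        return self._nodes[text]

    def node_count(self):
        return len(self._nodes)


def amalgamated(ctx):
    # One combined node per instruction: nothing can be shared.
    inst_a = [ctx.get_node("foo bar")]
    inst_b = [ctx.get_node("foo")]
    return inst_a, inst_b


def separate(ctx):
    # One node per piece of information: "foo" is the same object on both.
    inst_a = [ctx.get_node("foo"), ctx.get_node("bar")]
    inst_b = [ctx.get_node("foo")]
    return inst_a, inst_b
```

With separate nodes, the first element of each instruction's list is
the very same object.  With the amalgamated form, the two instructions
hold distinct objects even though both mention "foo".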

>> In my case using intrinsics would have to tie the intrinsic to the
>> instruction it is annotating.  This seems similar to your use-case.
>> This is straightforward to do if everything is SSA but once we've gone
>> beyond that things get a lot more complicated.  The mapping of
>> information to specific instructions really does seem like the most
>> difficult bit.

> No, intrinsics do not have to mirror existing instructions; yes,
> they can be used just to carry around specific data as arguments.
> Nonetheless, there we have our (implementation) problem: how to map
> info (e.g., intrinsics) to instructions, and vice versa?
>
> I am really curious about how you would do it in the pre-RA phase :)

Pre-RA it's relatively easy as long as we're still in SSA.  The
intrinsic would simply take the instruction it should annotate as an
operand.  After SSA it obviously becomes more difficult.  I don't have a
lot of good answers for that right now.  The live range for the value
defined by the annotated instruction and used by the intrinsic would
contain both instructions, so maybe that could be used to connect them.
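A rough sketch of the SSA case, again as a toy model in Python rather
than real LLVM code.  Inst, annotate, annotations_of and annotated_inst
are all made-up names; the only idea taken from the discussion is that
the annotation takes the annotated instruction as an operand, so the
mapping works in both directions through the use-def links.

```python
class Inst:
    """Toy SSA instruction: its result is used by other instructions."""

    def __init__(self, opcode, operands=(), info=None):
        self.opcode = opcode
        self.operands = list(operands)
        self.info = info      # payload, only used by annotations here
        self.users = []       # instructions that use this one's value
        for op in self.operands:
            op.users.append(self)


def annotate(inst, info):
    # The annotation takes the annotated instruction as its only operand.
    return Inst("annotation", [inst], info)


def annotations_of(inst):
    # Instruction -> info: scan the users for annotation pseudo-ops.
    return [u.info for u in inst.users if u.opcode == "annotation"]


def annotated_inst(annotation):
    # Info -> instruction: follow the operand back to its definition.
    return annotation.operands[0]
```

For example, after `y = Inst("add", [x, x])` and
`note = annotate(y, "secret")`, `annotations_of(y)` finds the payload
and `annotated_inst(note)` gets back to `y`.  Once values are no longer
in SSA form there is no single definition to follow, which is where
this simple scheme stops working.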

If the annotated instruction doesn't produce an output value (a store,
for example, on most machine architectures), you would use the chain
output in SelectionDAG, but there's no analogue in the MachineInstr
representation.
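As a hedged illustration of the chain idea, the toy model below gives
each node a list of result types and lets an annotation fall back to
the chain result when there is no value result.  Node, CHAIN and
annotate are hypothetical names; this is not SelectionDAG code, and the
store here omits its incoming chain operand to keep the sketch short.

```python
CHAIN = "chain"  # stand-in for the ordering-only result of a node


class Node:
    """Toy DAG node with typed results and (node, result_index) operands."""

    def __init__(self, opcode, result_types, operands=(), info=None):
        self.opcode = opcode
        self.result_types = list(result_types)
        self.operands = list(operands)
        self.info = info


def annotate(node, info):
    # Prefer a real value result; fall back to the chain result.
    values = [i for i, t in enumerate(node.result_types) if t != CHAIN]
    chains = [i for i, t in enumerate(node.result_types) if t == CHAIN]
    candidates = values or chains
    if not candidates:
        raise ValueError("node has no result to attach an annotation to")
    return Node("annotation", [], [(node, candidates[0])], info)
```

An "add" node with an "i32" result gets annotated through that value,
while a "store" node whose only result is the chain gets annotated
through the chain.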

                   -David