[llvm-dev] Metadata in LLVM back-end

Wed Jul 29 00:33:23 PDT 2020

>> On Jul 27, 2020, at 10:11 AM, David Greene via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>> Son Tuan VU via llvm-dev <llvm-dev at lists.llvm.org> writes:
>>
>>> Currently metadata (other than debug info) can be attached to IR
>>> instructions but disappears during DAG selection.
>>>
>>> My question is why we do not keep the metadata during code lowering and
>>> then attach to MachineInstr, just as for IR instructions? Is there any
>>> technical challenge, or is it only because nobody wants to do so?
>> I have wanted codegen metadata for a very long time so I'm interested to
>> hear the history behind this choice, and more importantly, whether
>> adding such capability would be generally acceptable to the community.
> The first questions need to be “what does it mean?”, “how does it work?”, and “what is it useful for?”.  It is hard to evaluate a proposal without that.

Hi everyone,

I'll try to answer each of these questions. The answers are probably
not exhaustive, but I hope they can serve as a starting point for an
interesting proposal (from my point of view, and that of Son Tuan VU
and David Greene):

- "What does it mean?": it means preserving specific information,
  represented as metadata attached to instructions, from the IR level
  down to the codegen phases.

- "How does it work?": metadata should be preserved across the various
  back-end transformations. For instance, during the lowering phase,
  DAGCombine performs several optimizations, potentially combining
  several instructions into one. The new instruction should then be
  assigned metadata obtained as a proper combination of the original
  ones (e.g., a union of the metadata information).

  It might be possible to have a dedicated data structure for such
  metadata, with an instance of that structure attached to each
  instruction.

- "What is it useful for?": I think it is quite context-specific, but in
  general it is useful when some "higher-level" information (e.g.,
  information that can be discovered only before the back-end stage of
  the compiler) is required in the back-end to perform
  "semantic"-related optimizations.
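
As a rough illustration of the "union of metadata" idea from the "How
does it work?" point, here is a minimal, self-contained C++ sketch. It
uses no LLVM API: CodegenMetadata, ToyInstr and combine are hypothetical
names invented for this example, and modelling metadata as a set of
string tags is an assumption, not an existing design.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for per-instruction codegen metadata.
// This is NOT an LLVM type; metadata is modelled as a set of tags.
struct CodegenMetadata {
  std::set<std::string> Tags;

  // Union policy: the result carries every tag found on any source.
  static CodegenMetadata merge(const std::vector<CodegenMetadata> &Sources) {
    CodegenMetadata Result;
    for (const CodegenMetadata &MD : Sources)
      Result.Tags.insert(MD.Tags.begin(), MD.Tags.end());
    return Result;
  }
};

// Hypothetical stand-in for a lowered instruction with attached metadata.
struct ToyInstr {
  std::string Opcode;
  CodegenMetadata MD;
};

// Models a combine step that replaces two instructions with a single one,
// giving the new instruction the union of the original metadata.
ToyInstr combine(const std::string &NewOpcode, const ToyInstr &A,
                 const ToyInstr &B) {
  return ToyInstr{NewOpcode, CodegenMetadata::merge({A.MD, B.MD})};
}
```

Union is only one possible policy. Whether it is the right one depends
on what a given kind of metadata means: for some properties it may be
more appropriate to drop the metadata when instructions are combined.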

To give a (quite generic) example where such codegen metadata may be
useful: in the field of "secure compilation", preserving security
properties across the compilation phases is essential. Such properties
are specified in the high-level specification of the program and may be
expressed as IR metadata. Being able to keep such IR metadata through
the codegen phases may allow the preservation of properties that those
phases could otherwise invalidate.

Cheers,
-- Lorenzo

> Metadata isn’t free - it must be maintained or invalidated for it to be useful.  The details on that dramatically shape whether it can be used for any given purpose.
>
> -Chris