[llvm-dev] Metadata in LLVM back-end

Sat Oct 10 04:13:12 PDT 2020

> That's the place to start, I think.  Gather a list of requirements/use
> cases along with the challenges we've discussed.  Then it's a matter of
> engineering a solution that fulfills the requirements while hitting as
> few of the challenges as possible.  Let's start by simply gathering some
> lists.  I'll take a quick stab and you and others can add to/edit it.
>
> Requirements
> ------------
> - Convey information not readily available in existing IR constructs to
>   very late-stage codegen (after regalloc/scheduling, right through
>   asm/object emission)

I see this more as the GOAL of the RFC, rather than a requirement.

> - Flexible format - it should be as simple as possible to express the
>   desired information while minimizing changes to APIs
I do not want to start a philosophical discussion (although I would
find it quite interesting), but "flexible" does not necessarily mean
"simple".

We could split this requirement as:

- Flexible format - the format should be expressive enough to model
  *virtually* any kind of information.

- Simple interface - expressing information and attaching it to MIR
  elements (e.g., instructions) should be "easy" (but what does *easy*
  mean here?)

> - Preserve information by default, only drop if explicitly told (I'm
>   trying to capture the requirements for your use-case here and this
>   differs from IR-level metadata)
What about giving end users the possibility to define a custom default
policy, as well as the possibility to define different types of policies?
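
To make the idea a bit more concrete, here is a minimal sketch in plain
C++. It is not based on any existing LLVM API: `PreservePolicy`,
`PolicyTable`, and the string "kind" keys are names I am making up purely
for illustration. It only shows "preserve by default, with a
user-overridable default and per-kind overrides".

```cpp
#include <map>
#include <string>

// Hypothetical: what a pass should do with attached information that it
// does not explicitly know how to handle.
enum class PreservePolicy { Keep, Drop };

// Hypothetical table mapping an information "kind" to its policy.
class PolicyTable {
  PreservePolicy Default = PreservePolicy::Keep; // preserve by default
  std::map<std::string, PreservePolicy> PerKind; // user overrides

public:
  // Let the user replace the default policy.
  void setDefault(PreservePolicy P) { Default = P; }

  // Let the user choose a policy for one specific kind of information.
  void setForKind(const std::string &Kind, PreservePolicy P) {
    PerKind[Kind] = P;
  }

  // Per-kind override if present, otherwise the current default.
  PreservePolicy lookup(const std::string &Kind) const {
    auto It = PerKind.find(Kind);
    return It == PerKind.end() ? Default : It->second;
  }
};
```

With something like this, a kind that nobody configured is kept, and
dropping it requires an explicit `setForKind` or `setDefault` call.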

Further, we must cope with instruction combining: when two instructions
are eligible for combination, how is the information associated with
them combined?

- Information transformation - the information associated with two
  instructions A and B, which are combined into an instruction C, should
  be properly transformed according to a user-defined policy.

  A default policy may be "assign the information of both A and B to C"
  (gather-all/assign-all policy?)
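
As a rough illustration of the gather-all idea, assuming the attached
information can be modelled as a set of strings (a simplification; the
names `InfoSet`, `MergePolicy`, `gatherAll`, `keepCommon`, and
`combineInfo` are hypothetical and not part of LLVM):

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <string>

// Hypothetical stand-in for the information attached to one instruction.
using InfoSet = std::set<std::string>;

// A merge policy decides what the combined instruction C receives, given
// the information attached to A and B.
using MergePolicy = std::function<InfoSet(const InfoSet &, const InfoSet &)>;

// Default "gather-all" policy: C receives the union of A's and B's info.
inline InfoSet gatherAll(const InfoSet &A, const InfoSet &B) {
  InfoSet C = A;
  C.insert(B.begin(), B.end());
  return C;
}

// Example of a user-supplied alternative: keep only what A and B share.
inline InfoSet keepCommon(const InfoSet &A, const InfoSet &B) {
  InfoSet C;
  for (const std::string &Item : A)
    if (B.count(Item))
      C.insert(Item);
  return C;
}

// Called when A and B are combined into C; uses gather-all unless the
// user passes another policy.
inline InfoSet combineInfo(const InfoSet &A, const InfoSet &B,
                           const MergePolicy &Policy = gatherAll) {
  assert(Policy && "merge policy must be callable");
  return Policy(A, B);
}
```

A real design would probably need a policy per kind of information
rather than one policy for everything, but the shape of the hook would
be similar.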

> - No bifurcation between "well-known"/"built-in" information and things
>   added later/locally
May I ask you to elaborate a bit more about this point?
> - Should not impact compile time excessively (what is "excessive?")

Probably, such estimation should be performed on

What about the granularity level?

- Granularity level - metadata should be attachable at different levels
  of granularity:

  - *Coarse*: MachineFunction level
  - *Medium*: MachineBasicBlock level
  - *Fine*:   MachineInstr level

Clearly, there are other degrees of granularity and/or dimensions to be
considered (e.g., LiveInterval, MIBundles, Loops, ...).
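
One possible way to picture this, as a sketch only: a side table keyed by
the granularity and an opaque element id. Everything here (`Granularity`,
`InfoStore`, the `unsigned` ids, string payloads) is a hypothetical
stand-in and does not correspond to existing LLVM classes.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical granularity levels, mirroring the list above.
enum class Granularity { Function, BasicBlock, Instruction };

// Hypothetical side table: (granularity, element id) -> attached info.
class InfoStore {
  using Key = std::pair<Granularity, unsigned>;
  std::map<Key, std::vector<std::string>> Store;

public:
  // Attach one piece of information to the element identified by Id.
  void attach(Granularity G, unsigned Id, std::string Info) {
    Store[std::make_pair(G, Id)].push_back(std::move(Info));
  }

  // Return the attached information, or nullptr if there is none.
  const std::vector<std::string> *lookup(Granularity G, unsigned Id) const {
    auto It = Store.find(std::make_pair(G, Id));
    return It == Store.end() ? nullptr : &It->second;
  }
};
```

Adding another dimension (say, live intervals) would then mean adding an
enumerator rather than a new mechanism, although whether a side table is
the right design at all is an open question.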

> Challenges of using intrinsics and other alternatives
> -----------------------------------------------------
> - Post-SSA annotation/how to associate intrinsics with
>   instructions/registers/types
>
> - Instruction selection fallout (inhibiting folding, etc.)
>
> - Register allocation impacts (extending live ranges, etc.)
>
> - Scheduling challenges (ensuring intrinsics can be found
>   post-scheduling, etc.)
>
> - Extending existing constructs (which ones?) requires hard-coding
>   aspects of information, reducing flexibility
>
> This is currently rather weasily-worded, because I didn't want to impose
> too many restrictions right off the bat.
>
>                   -David

Sorry for the long delay!

-- Lorenzo