[llvm-dev] Metadata in LLVM back-end

Wed Oct 21 01:49:18 PDT 2020

> On 20 Oct 2020, at 6:37 PM, David Greene <dag at hpe.com> wrote:
> 
> Lorenzo Casalino <lorenzo.casalino93 at gmail.com> writes:
> 
>>> - Flexible format - it should be as simple as possible to express the
>>>  desired information while minimizing changes to APIs
>> I do not want to raise a philosophical discussion (although, I would
>> find it quite interesting), but "flexible" does not necessarily mean
>> "simple".
>> 
>> We could split this requirement as:
> 
> Good idea to separate these.
> 
>> - Flexible format - the format should be expressive enough to model
>>   *virtually* any kind of information.
>> 
>> - Simple interface - expressing information and attaching it to MIR
>>   elements (e.g., instructions) should be "easy" (what does *easy*
>>   mean?)
> 
> I would say "easy" means:
> 
> - Utilities are available to make maintaining information as transparent
>  (automatic) as possible.
> 
> - When not automatic, it is straightforward to apply the necessary APIs
>  to keep information updated.
> 

Ok, perfect!

>>> - Preserve information by default, only drop if explicitly told (I'm
>>>  trying to capture the requirements for your use-case here and this
>>>  differs from IR-level metadata)
> 
>> What about giving end-users the possibility to define a custom
>> default policy, as well as the possibility to define different types
>> of policies?
> 
> Possibly, though that might be overkill.  We don't want to bog this down
> so much that it doesn't make progress.  I would lean toward picking a
> policy and then incrementally adding features as needed.
> 
>> Further, we must cope with the combination of instructions: when two
>> instructions are eligible for combination, how is the information
>> associated with them combined?
>> 
>> - Information transformation - the information associated with two
>>   instructions A and B, which are combined into an instruction C,
>>   should be properly transformed according to a user-specified policy.
>> 
>>   A default policy may be "assign the information of both A and B to
>>   C" (gather-all/assign-all policy?)
> 
> Again, I would lean toward just assigning both pieces of information and
> providing utilities to scrub the result if necessary.  If it turns out
> that other cases are common, we can add other default policies.
> 

I agree!
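
Just to fix ideas, the "gather-all" default plus a scrubbing utility
could be sketched as below. This is only a minimal sketch under my own
assumptions: MDEntry, MDList, gatherAll and scrubKind are hypothetical
names invented for illustration, not existing LLVM APIs, and a plain
vector stands in for whatever container we end up choosing.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for one piece of metadata attached to a MIR element.
struct MDEntry {
  std::string Kind;  // hypothetical tag naming the kind of information
  std::string Value; // payload, kept as a string for simplicity
  bool operator==(const MDEntry &O) const {
    return Kind == O.Kind && Value == O.Value;
  }
};

using MDList = std::vector<MDEntry>;

// Default "gather-all" policy: C receives every entry of A, followed by
// the entries of B that are not already present.
MDList gatherAll(const MDList &A, const MDList &B) {
  MDList C = A;
  for (const MDEntry &E : B)
    if (std::find(C.begin(), C.end(), E) == C.end())
      C.push_back(E);
  return C;
}

// Scrubbing utility: entries are dropped only when explicitly requested.
void scrubKind(MDList &L, const std::string &Kind) {
  L.erase(std::remove_if(L.begin(), L.end(),
                         [&](const MDEntry &E) { return E.Kind == Kind; }),
          L.end());
}

// Usage sketch: combine A and B into C, then scrub one kind on request.
MDList exampleCombine() {
  MDList A = {{"origin", "load"}, {"hint", "hot"}};
  MDList B = {{"hint", "hot"}, {"origin", "add"}};
  MDList C = gatherAll(A, B); // the shared "hint" entry is kept once
  assert(C.size() == 3);
  scrubKind(C, "hint");
  assert(C.size() == 2);
  return C;
}
```

Nothing is lost unless scrubKind is called, which matches the
"preserve by default" requirement discussed above.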

>>> - No bifurcation between "well-known"/"built-in" information and things
>>>  added later/locally
> 
>> May I ask you to elaborate a bit more about this point?
> 
> Sure.  The current IR metadata is bifurcated.  Some pieces of
> information are more "first-class" than others.  For example there are
> specialized metadata nodes
> (https://llvm.org/docs/LangRef.html#specialized-metadata-nodes) while
> other pieces of metadata are simple strings or numbers.
> 
> It would be simplest/easiest if metadata were handled uniformly.
> 

Ok, so this boils down to a uniform usage of the metadata.

>>> - Should not impact compile time excessively (what is "excessive?")
>> 
>> Probably, such estimation should be performed on
> 
> Did something get cut off here?

Oops. Yep, I removed a paragraph but apparently forgot to remove its
first sentence. In any case, we should discuss how to quantitatively
determine an acceptable upper bound on the compile-time overhead and
motivate it. For instance, an overhead of at most n% on compilation
time must be guaranteed, because ** list of reasons **.

Of course, first we should identify the worst-case scenario; probably
the case where all the MIR elements are decorated with metadata, and all
the API functionalities are employed?
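
As a minimal sketch of what "max n% overhead" would mean in practice,
assuming we simply compare the wall-clock time of a baseline build with
that of a build where the metadata is attached (the function names and
the choice of measurement are my assumptions, nothing is decided yet):

```cpp
#include <cassert>

// Relative compile-time overhead, in percent, of a build with metadata
// over a baseline build. Both times must use the same unit (e.g. seconds).
double overheadPercent(double BaselineTime, double WithMetadataTime) {
  assert(BaselineTime > 0.0 && "baseline time must be positive");
  return (WithMetadataTime - BaselineTime) / BaselineTime * 100.0;
}

// True if the measured overhead does not exceed the agreed bound MaxPercent.
bool withinBudget(double BaselineTime, double WithMetadataTime,
                  double MaxPercent) {
  return overheadPercent(BaselineTime, WithMetadataTime) <= MaxPercent;
}
```

The open question remains which value of MaxPercent is acceptable and
on which inputs the two times should be measured.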

> 
>> What about the granularity level?
>> 
>> - Granularity level - metadata information should be attachable at
>>   different levels of granularity:
>> 
>>   - *Coarse*: MachineFunction level
>>   - *Medium*: MachineBasicBlock level
>>   - *Fine*:   MachineInstr level
>> 
>> Clearly, there are other degrees of granularity and/or dimensions to
>> be considered (e.g., LiveInterval, MIBundles, Loops, ...).
> 
> It's probably a good idea to list at least the levels of granularity we
> expect to need.  I'd start with function/block/instruction as I can
> imagine uses for all three.  I am less sure about the other levels you
> mention.  We can add more capability later if needed.
> 
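
To make the three levels concrete, here is a minimal sketch of a side
table keyed by granularity level and element identifier. Everything in
it is an assumption made for illustration: Granularity, MetadataTable
and the numeric identifiers are hypothetical and do not correspond to
existing LLVM classes.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// The three levels we expect to need first.
enum class Granularity { Function, BasicBlock, Instruction };

// Hypothetical side table: (level, element id) -> attached metadata.
class MetadataTable {
  std::map<std::pair<Granularity, unsigned>, std::vector<std::string>> Table;

public:
  // Attach one piece of metadata to the element Id at level G.
  void attach(Granularity G, unsigned Id, std::string MD) {
    Table[{G, Id}].push_back(std::move(MD));
  }

  // Return the metadata attached to the element, or an empty list.
  std::vector<std::string> get(Granularity G, unsigned Id) const {
    auto It = Table.find({G, Id});
    return It == Table.end() ? std::vector<std::string>{} : It->second;
  }
};

// Usage sketch: the same id at different levels names different elements.
void exampleLevels() {
  MetadataTable T;
  T.attach(Granularity::Function, 0, "function-note");
  T.attach(Granularity::Instruction, 0, "instruction-note");
  assert(T.get(Granularity::Function, 0).size() == 1);
  assert(T.get(Granularity::BasicBlock, 0).empty());
}
```

Adding a further level later would then only require a new enumerator.
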
>> Sorry for the long delay!
> 
> No problem!  I know I'm extremely busy as I'm sure we all are.  :)
> 
> Since you initially raised the topic, do you want to take the lead in
> writing up a RFC?  I can certainly do it too but I want to give you
> right of first refusal.  :)
>                    -David

Uhm...actually, it wasn't me but Son Tuan, so the right of first
refusal should be granted to him :) And I have just noticed that he was
not CC'd on all our mails; I hope he was able to follow our discussion
anyway. I am adding him to this mail; let us wait and see whether he
has any critical feature or point to discuss.

Thank you, David :)

-- Lorenzo