[llvm-dev] Metadata in LLVM back-end

Wed Jan 6 05:56:13 PST 2021

Dear Tuan,

How are you doing? Did you manage to start the draft for the RFC?

I take this opportunity to wish you all the best for this new year :)

Best regards,
Lorenzo Casalino

On 10/11/20 at 09:27, Lorenzo Casalino wrote:
>
>
> On 09/11/20 at 00:30, Son Tuan VU wrote:
>> Hi,
>>
>> Thank you all for keeping this going. I was not aware that the
>> discussion was still going on; I am really sorry for this late reply.
>>
> Nice to hear from you again! Thank you for starting this thread ;)
>> I understand Chris' point about metadata design. Either the metadata
>> becomes stale or is removed (if we do not teach transformations to
>> preserve it), or we end up modifying many (if not all) 
>> transformations to keep the data intact.
>> Currently in the IR, I feel like the default behavior is to 
>> ignore/remove the metadata, and only a limited number of 
>> transformations know how to maintain and update it, which is a 
>> best-effort approach.
>> That being said, my initial thought was to adopt this approach for the
>> MIR, so that we can at least have a minimal mechanism to communicate 
>> additional information to various transformations, or even dump it to 
>> the asm/object file.
>> In other words, it is the responsibility of the users who 
>> introduce/use the metadata in the MIR to teach the transformations 
>> they selected how to preserve their metadata. A common API to 
>> abstract this would definitely help, just as combineMetadata() from 
>> lib/Transforms/Utils/Local.cpp does.
>>
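
A minimal sketch of what such a "common API" could look like, assuming a
best-effort policy where metadata is dropped unless the user has registered
a handler for its kind. This is plain C++ with hypothetical stand-in types
(MDKind, MDValue, MDMap, MetadataMerger); none of these names exist in LLVM,
and the sketch does not reproduce the actual behavior of combineMetadata().

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-ins for illustration; not LLVM types.
using MDKind = std::string;              // e.g. "secret"
using MDValue = std::string;             // opaque payload
using MDMap = std::map<MDKind, MDValue>; // metadata attached to one instruction

// User-supplied policy for one kind: return true and set Out to keep the
// kind on the surviving instruction, or return false to drop it.
using MergeFn =
    std::function<bool(const MDValue &, const MDValue &, MDValue &)>;

class MetadataMerger {
  std::map<MDKind, MergeFn> Handlers;

public:
  void registerKind(const MDKind &Kind, MergeFn Fn) {
    assert(Fn && "handler must be callable");
    Handlers[Kind] = std::move(Fn);
  }

  // Best-effort merge of the metadata of two instructions being combined.
  // A kind is dropped if it has no handler or is missing on either side.
  MDMap combine(const MDMap &Kept, const MDMap &Removed) const {
    MDMap Result;
    for (const auto &Entry : Kept) {
      auto Handler = Handlers.find(Entry.first);
      auto Other = Removed.find(Entry.first);
      if (Handler == Handlers.end() || Other == Removed.end())
        continue;
      MDValue Merged;
      if (Handler->second(Entry.second, Other->second, Merged))
        Result[Entry.first] = Merged;
    }
    return Result;
  }
};
```

A user who cares about one kind would register a single handler for it and
call combine() from the transformations they selected; every other kind keeps
the default "drop" behavior.
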
> Unfortunately, I have never worked with LLVM IR metadata (I have
> focused almost exclusively on the back-end and have only scratched
> the surface of LLVM's middle-end), but I see your point.
>
> Clearly, applying the needed modifications to all the back-end
> transformations/optimizations is infeasible and, probably, not worth
> it -- different users may have different requirements/needs
> regarding a specific pass.
>
> I like the idea of a common API to handle the MIR metadata, and to let
> the end user handle such data. Of course, if the community encounters
> common cases while handling the metadata, such cases may be integrated
> into the upstream project.
>
> Nonetheless, the main point of this thread is to preserve middle-end
> metadata down to the back-end, right after the Instruction Selection
> phase. Hence, regardless of the end user's needs, a "preserve-all"
> policy during the lowering stage is required, which will involve some
> changes, in particular in the DAGCombine pass.
>
>
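
The "preserve-all" policy mentioned above could be sketched as follows:
whenever a combine step replaces one or more nodes with a new one, the new
node inherits the union of their metadata. Node, MDEntry, MDSet and
makeReplacement are hypothetical names for illustration only; they are not
SelectionDAG or MIR APIs, and a real implementation would still have to
decide what to do when inherited entries conflict.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; not LLVM types.
using MDEntry = std::pair<std::string, std::string>; // (kind, value)
using MDSet = std::set<MDEntry>;

struct Node {
  std::string Opcode;
  MDSet MD;
};

// Builds the node that replaces `Replaced`, copying over every metadata
// entry found on any of them (duplicates collapse because MDSet is a set).
Node makeReplacement(const std::string &Opcode,
                     const std::vector<const Node *> &Replaced) {
  Node New;
  New.Opcode = Opcode;
  for (const Node *Old : Replaced) {
    assert(Old && "replaced node must not be null");
    New.MD.insert(Old->MD.begin(), Old->MD.end());
  }
  return New;
}
```

Nothing is ever dropped here, so conflicting values for the same kind simply
coexist on the new node; resolving them is left to whoever consumes the
metadata later.
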
>> As for my use case, it is also security-related. However, I do not
>> consider the metadata to be a compilation "correctness" criterion:
>> metadata, by definition (from the LLVM IR), can be safely removed
>> without affecting the program's correctness.
>> If possible, I would like to have more details on Lorenzo's use case
>> in order to see how metadata would interfere with the program's
>> correctness.
>>
> I would really like to discuss the details here but, unfortunately, I
> am working on a publication and thus cannot disclose them :(
>
> However, by "correctness" I do not mean "I/O correctness", but the
> preservation of a security property expressed in the front-end (e.g.,
> specified in the source code) or in the middle-end (e.g., specified in
> the LLVM IR, for instance by a transformation pass).
>
> From a security point of view, removing or altering metadata does not
> interfere with the I/O functionality of the code (although it may
> impact performance), but it may introduce vulnerabilities.
>
>> As for the RFC, I can definitely try to write one, but this would be 
>> my first time doing so. But maybe it is better to start with 
>> Lorenzo's proposal, as you have already been working on this? Please 
>> tell me if you prefer me to start the RFC though.
>>
> It is the first time for me too, do not worry!
>
> We could just use any other RFC as a template to get started :D
>
> I think that a structure like the following would be fine:
>
>   1. Background
>      1.1 Motivation
>      1.2 Use-cases
>      1.3 Other approaches
>   2. Goal(s)
>   3. Requirements
>   4. Drawbacks and main bottlenecks
>   5. Design sketch
>   6. Roadmap sketch
>   7. Potential future development
>
> It may be a bit overkill; you are warmly invited to cut/refine these
> points!
>
> And...no, I still have no sketch of the RFC; sorry, I have had quite a
> workload these days.
>
> Yes, you can start writing up the RFC.
>
> Quoting David:
>
>   "Since you first raised the topic [...] I want to give you right of
>   first refusal."
>
>
> Have a nice day!
>
> -- Lorenzo
>
>> Thank you again for keeping this going.
>>
>> Sincerely,
>>
>> - Son
>>
>> On Wed, Nov 4, 2020 at 6:30 PM Lorenzo Casalino
>> <lorenzo.casalino93 at gmail.com> wrote:
>>
>>
>>     On 04/11/20 at 17:40, David Greene wrote:
>>     > Sorry about the late reply.
>>     >
>>     > Lorenzo Casalino <lorenzo.casalino93 at gmail.com> writes:
>>     >
>>     >>>>> - Should not impact compile time excessively (what is "excessive?")
>>     >>>> Probably, such estimation should be performed on
>>     >>> Did something get cut off here?
>>     >> Oops. Yep, I removed a paragraph but, apparently, forgot to remove
>>     >> its first sentence. In any case, we should discuss how to
>>     >> quantitatively determine an acceptable upper bound on the
>>     >> compile-time overhead and how to motivate it. For instance, a
>>     >> maximum overhead of n% on compilation time must be guaranteed,
>>     >> because ** list of reasons **.
>>     > I am not sure how we'd arrive at such a number or motivate/defend
>>     > it. Do we have any sense of the impact of the existing metadata
>>     > infrastructure?  If not I'm not sure we can do it for something
>>     > completely new.  I think we can set a goal but we'd have to revise
>>     > it as we gain experience.
>>     I think it is the best approach to employ :)
>>     >>> Since you initially raised the topic, do you want to take the
>>     >>> lead in writing up a RFC?  I can certainly do it too but I want
>>     >>> to give you right of first refusal.  :)
>>     >>>                     -David
>>     >> Uhm...actually, it wasn't me but Son Tuan, so the right of first
>>     >> refusal should be granted to him :) And I have just noticed that
>>     >> he wasn't included in CC on all our mails; I hope he was able to
>>     >> follow our discussion anyway. I am adding him to this mail; let
>>     >> us wait and see whether he has any critical feature or point to
>>     >> discuss.
>>     > Fair enough!  I have recently taken on a lot more work so
>>     > unfortunately I can't devote a lot of time to this at the moment.
>>     > I've got to clear out my pipeline first.  I'd be very happy to help
>>     > review text, etc.
>>     Do not worry, it is ok ;) While we wait for any feedback/input
>>     from Son, I'll try to prepare a draft of the RFC and publish it here.
>>
>>     Thank you David, and have a nice day :)
>>
>>     -- Lorenzo
>>
>>     >                  -David
>>