[llvm-dev] Metadata in LLVM back-end

Lorenzo Casalino via llvm-dev llvm-dev at lists.llvm.org
Tue Nov 10 00:27:29 PST 2020


On 09/11/20 at 00:30, Son Tuan VU wrote:
> Hi,
>
> Thank you all for keeping this going. Indeed I was not aware that the 
> discussion was going on, I am really sorry for this late reply.
>
Nice to hear from you again! Thank you for starting this thread ;)
> I understand Chris' point about metadata design. Either the metadata 
> becomes stale or is removed (if we do not teach transformations to 
> preserve it), or we end up modifying many (if not all) transformations 
> to keep the data intact.
> Currently in the IR, I feel like the default behavior is to 
> ignore/remove the metadata, and only a limited number of 
> transformations know how to maintain and update it, which is a 
> best-effort approach.
> That being said, my initial thought was to adopt this approach to the 
> MIR, so that we can at least have a minimal mechanism to communicate 
> additional information to various transformations, or even dump it to 
> the asm/object file.
> In other words, it is the responsibility of the users who 
> introduce/use the metadata in the MIR to teach the transformations 
> they selected how to preserve their metadata. A common API to abstract 
> this would definitely help, just as combineMetadata() from 
> lib/Transforms/Utils/Local.cpp does.
>
Unfortunately, I have never worked with LLVM IR metadata (I have focused
almost exclusively on the back-end and have only scratched the surface
of LLVM's middle-end), but I see your point.

Clearly, applying the needed modifications to all the back-end
transformations/optimizations is infeasible and probably not worth it:
different users may have different requirements for a specific pass.

I like the idea of a common API for handling MIR metadata that leaves
the handling of such data to the end user. Of course, if the community
runs into common cases while handling the metadata, those cases could be
integrated into the upstream project.
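
To make the "common API, user-supplied handling" idea a bit more
concrete, here is a minimal sketch. It is a toy model and does not use
any real LLVM or MIR API: ToyMachineInstr, ToyMetadataPolicy, MergeRule
and the metadata kind names are all hypothetical. It only shows the
best-effort behaviour discussed above: metadata kinds that nobody
registered a rule for are dropped when two instructions are combined.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical toy model, NOT the LLVM/MIR API. Metadata is a map from a
// kind name to a string payload attached to a machine instruction.
using MetadataMap = std::map<std::string, std::string>;

struct ToyMachineInstr {
  std::string Opcode;
  MetadataMap MD;
};

// User-supplied rule: given the payloads found on both instructions,
// write the surviving payload to Out and return true, or return false
// to drop this metadata kind.
using MergeRule = std::function<bool(const std::string &A,
                                     const std::string &B,
                                     std::string &Out)>;

class ToyMetadataPolicy {
  std::map<std::string, MergeRule> Rules;

public:
  // Users register one rule per metadata kind they care about.
  void registerRule(const std::string &Kind, MergeRule Rule) {
    assert(Rule && "rule must be callable");
    Rules[Kind] = std::move(Rule);
  }

  // Best-effort combine: a kind survives only if a rule is registered
  // for it, both instructions carry it, and the rule accepts the merge.
  MetadataMap combine(const ToyMachineInstr &Kept,
                      const ToyMachineInstr &Removed) const {
    MetadataMap Result;
    for (const auto &Entry : Kept.MD) {
      auto RuleIt = Rules.find(Entry.first);
      auto OtherIt = Removed.MD.find(Entry.first);
      if (RuleIt == Rules.end() || OtherIt == Removed.MD.end())
        continue; // unknown kind or one-sided metadata: drop it
      std::string Merged;
      if (RuleIt->second(Entry.second, OtherIt->second, Merged))
        Result[Entry.first] = Merged;
    }
    return Result;
  }
};
```

A pass that merges two instructions would call combine() and attach the
result to the surviving instruction; users who need a given kind
preserved only have to register a rule for it.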

Nonetheless, the main point of this thread is to preserve middle-end
metadata down to the back-end, right after the Instruction Selection
phase. Hence, regardless of the end user's needs, a "preserve-all"
policy is required during the lowering stage, which will involve some
changes, in particular in the DAGCombine pass.
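
As a rough illustration of what I mean by "preserve-all", here is a toy
sketch. It is not the SelectionDAG API: ToyNode, MetadataSets and
foldPreservingAll are hypothetical names, and the real DAGCombine logic
is far more involved. The idea is simply that, when several nodes are
folded into one, the new node inherits the union of their metadata
instead of silently losing it.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy model, NOT the SelectionDAG API. Each metadata kind
// maps to the set of payloads collected so far.
using MetadataSets = std::map<std::string, std::set<std::string>>;

struct ToyNode {
  std::string Opcode;
  MetadataSets MD;
};

// "Preserve-all" fold: the replacement node inherits the union of the
// metadata of every node it replaces, so nothing is lost in lowering.
ToyNode foldPreservingAll(const std::string &NewOpcode,
                          const std::vector<ToyNode> &Replaced) {
  assert(!Replaced.empty() && "a fold must replace at least one node");
  ToyNode Result;
  Result.Opcode = NewOpcode;
  for (const ToyNode &N : Replaced)
    for (const auto &Entry : N.MD)
      Result.MD[Entry.first].insert(Entry.second.begin(),
                                    Entry.second.end());
  return Result;
}
```

With this policy, folding a multiply and an add into a single fused node
keeps the annotations of both; deciding what the merged annotations
mean is then left to whoever consumes them later in the back-end.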


> As for my use case, it is also security-related. However, I do not 
> consider the metadata to be a compilation "correctness" criterion: 
> metadata, by definition (from the LLVM IR), can be safely removed 
> without affecting the program's correctness.
> If possible, I would like to have more details on Lorenzo's use case 
> in order to see how metadata would interfere with the program's
> correctness.
>
I would really like to discuss the details here but, unfortunately, I am
working on a publication and thus cannot disclose any of them :(

However, by "correctness" I do not mean "I/O correctness", but the
preservation of a security property expressed in the front-end (e.g.,
specified in the source code) or in the middle-end (e.g., specified in
the LLVM IR, for instance by a transformation pass).

From a security point of view, removing or altering metadata does not
interfere with the I/O functionality of the code (although it may affect
performance), but it may introduce vulnerabilities.

> As for the RFC, I can definitely try to write one, but this would be 
> my first time doing so. But maybe it is better to start with Lorenzo's 
> proposal, as you have already been working on this? Please tell me if 
> you prefer me to start the RFC though.
>
It is the first time for me too, do not worry!

We could just use any other RFC as a template to get started :D

I think that a structure like the following would be fine:

   1. Background
      1.1 Motivation
      1.2 Use-cases
      1.3 Other approaches
   2. Goal(s)
   3. Requirements
   4. Drawbacks and main bottlenecks
   5. Design sketch
   6. Roadmap sketch
   7. Potential future development

It may be a bit overkill; you are warmly invited to cut/refine these points!

And... no, I still have no draft of the RFC; sorry, I have had quite a
workload these days.

Yes, you can start writing up the RFC.

Quoting David:

   "Since you initially raised the topic [...] I want to give you
   right of first refusal."


Have a nice day!

-- Lorenzo

> Thank you again for keeping this going.
>
> Sincerely,
>
> - Son
>
> On Wed, Nov 4, 2020 at 6:30 PM Lorenzo Casalino 
> <lorenzo.casalino93 at gmail.com> wrote:
>
>
>     On 04/11/20 at 17:40, David Greene wrote:
>     > Sorry about the late reply.
>     >
>     > Lorenzo Casalino <lorenzo.casalino93 at gmail.com> writes:
>     >
>     >>>>> - Should not impact compile time excessively (what is "excessive?")
>     >>>> Probably, such estimation should be performed on
>     >>> Did something get cut off here?
>     >> Oops. Yep, I removed a paragraph but, apparently, I forgot the first
>     >> period. In any case, we should discuss how to quantitatively
>     >> determine an acceptable upper bound on the compilation-time
>     >> overhead and give a motivation for it. For instance, max n%
>     >> overhead on the compilation time must be guaranteed, because
>     >> ** list of reasons **.
>     > I am not sure how we'd arrive at such a number or motivate/defend it.
>     > Do we have any sense of the impact of the existing metadata
>     > infrastructure?  If not I'm not sure we can do it for something
>     > completely new.  I think we can set a goal but we'd have to revise it as
>     > we gain experience.
>     I think it is the best approach to employ :)
>     >>> Since you initially raised the topic, do you want to take the lead in
>     >>> writing up an RFC?  I can certainly do it too but I want to give you
>     >>> right of first refusal.  :)
>     >>>                     -David
>     >> Uhm... actually, it wasn't me but Son Tuan, so the right of refusal
>     >> should be granted to him :) And I have just noticed that he wasn't
>     >> included in CC of all our mails; I hope he was able to follow our
>     >> discussion anyway. I am adding him to this mail; let us wait and see
>     >> whether he has any critical feature or point to discuss.
>     > Fair enough!  I have recently taken on a lot more work so unfortunately
>     > I can't devote a lot of time to this at the moment.  I've got to clear
>     > out my pipeline first.  I'd be very happy to help review text, etc.
>     Do not worry, it is ok ;) While we wait for any feedback/input from Son,
>     I'll try to prepare a draft of the RFC and publish it here.
>
>     Thank you David, and have a nice day :)
>
>     -- Lorenzo
>
>     >                  -David
>