[LLVMdev] [RFC] Add warning capabilities in LLVM.

David Blaikie dblaikie at gmail.com
Wed Jul 24 22:20:22 PDT 2013


On Tue, Jul 23, 2013 at 4:55 PM, Quentin Colombet <qcolombet at apple.com> wrote:
> Hi David,
>
> Thanks for the feedbacks/clarifications, I appreciate.
> You will find my comments inlined with your mail.
>
> For a quick reference, here is the summary.
>
> ** Summary **
>
> The other considered approach would provide some callbacks for
> event/information we think they are important.
> Like Chris pointed out, some of the information is not available for
> querying, thus the call back should provide sufficient information.

Sure - the idea is we'd just flesh this out based on user-demand.
Anything we need, we plumb through. A Simple Matter of Programming.
(yeah, some might say a maintenance burden, but I'm still assuming we
don't have, nor want, many of these)

> The client of the callback could then query llvm if the callbacks does not
> provide sufficient information (and assuming more information would be
> available via querying).

Sure

>
> At the moment, I see the following use cases:
> - Optimization hints (see Hal’s idea).
> - Fatal error/warning reporting (this could be done via the first proposal:
> enum + bool + message.).

See my reply to Chris regarding the tradeoffs between enum +
classification (there's more than just error/warning - it should be an
enum in any case, even if we only support those two for now) + message
and enum + (maybe a classification) + placeholder parameters (which
could be strings, or possibly more than strings - such as DI* related
stuff which might allow us to link back to original source variables
in a frontend).

> - Stack size reporting.

The point with stack size is to indicate that this is one
(particularly common/canonical) example of when a warning that might
obviously be implemented by the LLVM warning callback could be better
served by a more general-purpose callback.

That's my understanding anyway - again, this is hardly my domain, so
take what I have with a teacup of salt - in part I'm just trying to
help the dialog along so we're all on the same page, not express the
relative value of these options.

>
> What else?
> Thoughts?
>
> Thanks again for all the feedbacks.
>
> Cheers,
>
> -Quentin
>
>
> On Jul 23, 2013, at 3:37 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
> On Mon, Jul 22, 2013 at 4:17 PM, Quentin Colombet <qcolombet at apple.com>
> wrote:
>
> Hi,
>
> Compared to my previous email, I have added Hal’s idea for formatting the
> message and pull back some idea from the "querying framework”.
> Indeed, I propose to add some information in the reporting so that a
> front-end (more generally a client) can filter the diagnostics or take
> proper actions.
> See the details hereafter.
>
> On Jul 22, 2013, at 2:25 PM, Chandler Carruth <chandlerc at google.com> wrote:
>
>
> On Mon, Jul 22, 2013 at 2:21 PM, Eric Christopher <echristo at gmail.com>
> wrote:
>
>
> This is pretty much the same as what Quentin proposed (with the
> addition of the enum), isn't it?
>
>
> Pretty close yeah.
>
>
> Another thought and alternate strategy for dealing with these sorts of
> things:
>
> A much more broad set of callback machinery that allows the backend to
> communicate values or other information back to the front end that can
> then decide what to do. We can define an interface around this, but
> instead of having the backend vending diagnostics we have the callback
> take a "do something with this value" which can just be "communicate
> it back to the front end" or a diagnostic callback can be passed down
> from the front end, etc.
>
> This will probably take a bit more design to get a general framework
> set up, but imagine the usefulness of say being able to automatically
> reschedule a jitted function to a thread with a larger default stack
> size if the callback states that the thread size was N+1 where N is
> the size of the stack for a thread you've created.
>
>
>
> FWIW, *this* is what I was trying to get across. Not that it wouldn't be a
> callback-based mechanism, but that it should be a fully general mechanism
> rather than having something to do with warnings, errors, notes, etc. If a
> frontend chooses to use it to produce such diagnostics, cool, but there are
> other use cases that the same machinery should serve.
>
>
> I like the general idea.
>
> To be sure I understood the proposal, let me give an example.
>
> ** Example **
> The compiler says here is the size of the stack for Loc via a “handler”
> (“handler" in the sense whatever mechanism we come up to make such
> communication possible). Then the front-end builds the diagnostic from that
> information (or query for more if needed) or drops everything if it does not
> care about this size for instance (either it does not care at all or the
> size is small enough compared to its setting).
>
> ** Comments **
> Unless we have one handler per -kind of - use, and I would like to avoid
> that,
>
>
> I think that's, somewhat, Chandlers point (sorry, I hate to play
> Telephone here - but hope to help clarify some positions... apologies
> if this just further obfuscates/confuses). I believe the idea is some
> kind of generic callback type with a bunch of no-op-default callbacks,
> override the ones your frontend cares about ("void onStackSize(size_t
> bytes)", etc...).
>
> I see.
>
>
> Yes, we could come up with a system that doesn't require adding a new
> function call for every data that needs to be provided. Does it seem
> likely we'll have so many of these that we'll really want/need that?
>
>
> That’s a good point.
> I guess there are use cases that we may not anticipate and it would be nice
> if we do not have to modify this interface for that.
> In particular, out-of-the-tree target may want to do fancy thing without to
> push for change in the public tree.
>
> For now, I believe this is great that many people contributed their ideas
> and use cases, we can then decide what we want and what we do not want to
> address in the near future.
>
> Anyway, I agree that we should focus on what we care the most when we are
> done with collecting the use cases (are we done?).
>
>
> I think we should still provide an information on the severity of the
> thing we are reporting and what we are reporting.
> Basically:
> - Severity: Will the back-end abort after the information pass down or will
> it continue (the boolean of the previous proposal)?
>
>
> In the case of specific callbacks - that would be statically known
> (you might have a callback for some particular problem (we ran out of
> registers & can't satisfy this inline asm due to the register
> allocation of this function) - it's the very contract that that
> callback is a fatal problem). If we have a more generic callback
> mechanism, yes - we could enshrine some general properties (such as
> fatality) in the common part & leave the specifics of what kind of
> fatal problem to the 'blob’.
>
>
> Agreed.
>
>
> - Kind: What are we reporting (the enum from the previous proposal)?
>
> I also think we should be able to provide a default (formatted) message,
> such that a client that does not need to know what to do with the
> information can still print something somehow useful, especially on abort
> cases.
>
>
> Do you have some examples of fatal/abort cases we already have & how
> they're reported today? (including what kind of textual description we
> use?)
>
>
> Here are a few example of some warnings that you can find in the back end:
> - IntrinsicLowering:
> case Intrinsic::stackrestore: {
>  if (!Warned)
>
>    errs() << “WARNING: this target does not support the llvm.stack"
>
>              << (Callee->getIntrinsicID() == Intrinsic::stacksave ?
>                "save" : "restore") << " intrinsic.\n”;
> - PrologEpilogInserter:
>     errs() << "warning: Stack size limit exceeded (" << MFI->getStackSize()
>            << ") in " << Fn.getName()  << ".\n”;
>
> Actually, when I wrote this, I had Hal’s Optimization Diary in mind:
> <From Hal>
> - "*This loop* cannot be optimized because the induction variable *i* is
> unsigned, and cannot be proved not to wrap"
> - "*This loop* cannot be vectorized because the compiler cannot prove that
> memory read from *a* does not overlap with memory written to through *b*"
> - "*This loop* cannot be vectorized because the compiler cannot prove that
> it is unconditionally safe to dereference the pointer *a*.
>
> - The message string is text but a single kind of markup is allowed:
> <debug/>, for example:
>     "We cannot vectorize <debug/> because <debug/> is an unfriendly
> variable"
>   (where the first will be replaced by text derived from a DIScope and the
> second from a DIVariable).
> </Form Hal>
>
>
> Thus, it sounds a good thing to me to have a string with some markers to
> format the output plus the arguments to be used in the formatted output.
> Hal’s proposal could do the trick (although I do not know if DIDescriptor
> are the best thing to use here).
>
> ** Summary **
> I am starting to think that we should be able to cover the reporting case
> plus some querying mechanism with something like:
> void reportSomehtingToFEHandler(enum Reporting Kind, bool IsAbort, <some
> information>, const char* DefautMsg, <pointer to a list of args to format in
> the defautMsg>)
>
>
> Personally I dislike losing type safety in this kind of API ("here's a
> blob of data you must programmatically query based on a schema implied
> by the 'Kind' parameter & some documentation you read"). I'd prefer
> explicit callbacks per thing - if we're going to have to write an
> explicit structure & document the parameters to each of these
> callbacks anyway, it seems easier to document that by API. (for fatal
> cases we could have no default implementations - this would ensure
> clients would be required to update for new callbacks & not
> accidentally suppress them)
>
>
> If we want to let people do everything they want without modifying the
> existing structure, I think we need both.
> I agree that for the cases we care the most, we could specialize that
> approach with a specific call back for each cases.
>
> Also note that the initial intend was to report error/warning, thus the
> <some information> is not required.
> My point is, if we do not rely on the front-end to take specific action,
> there is no need to pass this information. Therefore, we can eliminate that
> type safety problem.
>
>
> Where <some information>  is supposed to be the class/struct/pointer to the
> relevant information for this kind. If it is not enough the FE should call
> additional APIs to get what it wants.
>
> This looks similar to the “classical” back-end report to front-end approach,
> but gives more freedom to the front-end as it can choose what to do based on
> the attached information.
> I also believe this will reduce the need to expose back-end APIs and speed
> up the process.
>
>
> Speed up the process of adding these diagnostic, possibly at the cost
> of having a more opaque/inscrutible API to data from LLVM, it seems.
>
>
> Good point and it matches what I have written above.
>
> Indeed, if we rely on the front-end to query for more information when it
> gets a reporting like this, we do not need to pass the information and we
> avoid this problem.
>
>
> However, the ability of the front-end (or client) to query the back-end is
> limited to the places where the back-end is reporting something. Also, if
> the back-end is meant to abort, the FE cannot do anything about it (e.g.,
> the stack size is not big enough for the jitted function).
> That’s why I said it cover “some" querying mechanism.
>
> ** Concerns **
> 1. Testing.
>
>
> Testing - I assume we'd have opt/llc register for all these callbacks
> & print them in some way (it doesn't need to have a "stack size is too
> small warning" it just needs to print the stack size whenever it's
> told - or maybe have some way to opt in to callback rendering) & then
> check the behavior with FileCheck as usual (perhaps print this stuff
> to stderr so it doesn't get confused with bytecode/asm under -o -).
>
> That tests LLVM's contract  - that it called the notifications.
> Testing Clang's behavior when these notifications are provided would
> either require end-to-end testing (just having Clang tests that run
> LLVM, assume it already passes the LLVM-only tests & then tests Clang
> behavior on top of that) as we do in a few places already - or have
> some kind of stub callback implementation we can point Clang to (it
> could read a file of callbacks to call). That would be nice, but going
> on past experience I don't suppose anyone would actually bother to
> implement it.
>
>
> Assuming we will always emit these reports, relying on a front-end to filter
> out what is not currently relevant (e.g., we did not set the stack size
> warning in the FE), what will happen when we test (make check) without a
> front-end?
> I am afraid we will pollute all tests or we will have some difficulty to
> test a specific reporting.
>
> 2. Regarding a completely query based approach, like Chris pointed out, I do
> not see how we can report consistent information at any given time. Also,
> Eric, coming back to your jit example, how could we prevent the back-end to
> abort if the jitted is too big for the stack?
>
>
> Eric's (originally Chandler's discussed in person) example wasn't
> about aborting compilation. The idea was you JIT a function, you get a
> callback saying "stack size 1 million bytes" and so you spin up a
> thread that has a big stack to run this function you just compiled.
>
>
> Thanks for the clarification.
>
>
> The point of the example is that a pure warning-based callback is
> 'general' for LLVM but very specific for LLVM /clients/ (LLVM as a
> library, something we should keep in mind) - all they can do is print
> it out. If we provide a more general feature for LLVM clients
> (callbacks that notify those clients about things they might be
> interested in, like the size of a function's stack) then they can
> build other features (apart from just warning) such as a JIT that
> dynamically chooses thread stack size based on the stack size of the
> functions it JITs.
>
>
> I see your point.
>
>
>
> 3. Back to the strictly reporting approach where we extend the inlineasm
> handler (the approach proposed by Bob and that I sketched a little bit
> more), now looks similar to this approach expect that the back-end chooses
> what it is relevant to report and the back-end does not need to pass down
> the information.
> The concern is how do we easily (in a robust and extendable manner) provide
> a front-end/back-end option for each warning/error?
>
>
> Thoughts?
>
> Cheers,
>
> -Quentin
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>




More information about the llvm-dev mailing list