[LLVMdev] [RFC] Add warning capabilities in LLVM.

Tue Jul 23 15:37:06 PDT 2013

On Mon, Jul 22, 2013 at 4:17 PM, Quentin Colombet <qcolombet at apple.com> wrote:
> Hi,
>
> Compared to my previous email, I have added Hal’s idea for formatting the
> message and brought back some ideas from the “querying framework”.
> Indeed, I propose to add some information in the reporting so that a
> front-end (more generally a client) can filter the diagnostics or take
> proper actions.
> See the details hereafter.
>
> On Jul 22, 2013, at 2:25 PM, Chandler Carruth <chandlerc at google.com> wrote:
>
>
> On Mon, Jul 22, 2013 at 2:21 PM, Eric Christopher <echristo at gmail.com>
> wrote:
>>
>> >> This is pretty much the same as what Quentin proposed (with the
>> >> addition of the enum), isn't it?
>> >>
>> >
>> > Pretty close yeah.
>> >
>>
>> Another thought and alternate strategy for dealing with these sorts of
>> things:
>>
>> A much more broad set of callback machinery that allows the backend to
>> communicate values or other information back to the front end that can
>> then decide what to do. We can define an interface around this, but
>> instead of having the backend vending diagnostics we have the callback
>> take a "do something with this value" which can just be "communicate
>> it back to the front end" or a diagnostic callback can be passed down
>> from the front end, etc.
>>
>> This will probably take a bit more design to get a general framework
>> set up, but imagine the usefulness of say being able to automatically
>> reschedule a jitted function to a thread with a larger default stack
>> size if the callback states that the thread size was N+1 where N is
>> the size of the stack for a thread you've created.
>
>
> FWIW, *this* is what I was trying to get across. Not that it wouldn't be a
> callback-based mechanism, but that it should be a fully general mechanism
> rather than having something to do with warnings, errors, notes, etc. If a
> frontend chooses to use it to produce such diagnostics, cool, but there are
> other use cases that the same machinery should serve.
>
>
> I like the general idea.
>
> To be sure I understood the proposal, let me give an example.
>
> ** Example **
> The compiler says “here is the size of the stack for Loc” via a “handler”
> (“handler” in the sense of whatever mechanism we come up with to make such
> communication possible). Then the front-end builds the diagnostic from that
> information (or queries for more if needed) or drops everything if it does
> not care about this size, for instance (either it does not care at all or
> the size is small enough compared to its setting).
>
> ** Comments **
> Unless we have one handler per kind of use, and I would like to avoid
> that,

I think that's, somewhat, Chandler's point (sorry, I hate to play
Telephone here - but hope to help clarify some positions... apologies
if this just further obfuscates/confuses). I believe the idea is some
kind of generic callback type with a bunch of no-op-default callbacks,
override the ones your frontend cares about ("void onStackSize(size_t
bytes)", etc...).
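
A minimal sketch of what such a callback type could look like, assuming a
plain C++ base class with virtual no-op defaults. Only the onStackSize
signature comes from the text above; the class names, onSpillCount, and
finishFunction are hypothetical and are not existing LLVM APIs.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical interface: every notification has a no-op default, so a
// client only overrides the callbacks it cares about.
class BackendCallbacks {
public:
  virtual ~BackendCallbacks() = default;
  virtual void onStackSize(std::size_t /*bytes*/) {}
  virtual void onSpillCount(unsigned /*spills*/) {} // invented second example
};

// A client that only cares about stack sizes; onSpillCount stays a no-op.
class StackSizeRecorder : public BackendCallbacks {
public:
  std::size_t LargestStack = 0;
  void onStackSize(std::size_t bytes) override {
    if (bytes > LargestStack)
      LargestStack = bytes;
  }
};

// Stand-in for the backend finishing code generation for one function.
inline void finishFunction(BackendCallbacks &CB, std::size_t FrameBytes,
                           unsigned Spills) {
  CB.onStackSize(FrameBytes);
  CB.onSpillCount(Spills);
}
```

A client that passes a plain BackendCallbacks object ignores every
notification, which is the "no-op-default" behavior described above.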

Yes, we could come up with a system that doesn't require adding a new
function call for every piece of data that needs to be provided. Does it seem
likely we'll have so many of these that we'll really want/need that?

> I think we should still provide information on the severity of the
> thing we are reporting and what we are reporting.
> Basically:
> - Severity: Will the back-end abort after the information is passed down or
> will it continue (the boolean of the previous proposal)?

In the case of specific callbacks, that would be statically known: you
might have a callback for some particular problem (we ran out of
registers & can't satisfy this inline asm due to the register
allocation of this function), and it's part of that callback's
contract that the problem is fatal. If we have a more generic callback
mechanism, yes - we could enshrine some general properties (such as
fatality) in the common part & leave the specifics of what kind of
fatal problem to the 'blob'.

> - Kind: What are we reporting (the enum from the previous proposal)?
>
> I also think we should be able to provide a default (formatted) message,
> such that a client that does not know what to do with the information can
> still print something reasonably useful, especially in abort cases.

Do you have some examples of fatal/abort cases we already have & how
they're reported today? (including what kind of textual description we
use?)

> Thus, it sounds like a good thing to me to have a string with some markers
> to format the output plus the arguments to be used in the formatted output.
> Hal’s proposal could do the trick (although I do not know if DIDescriptors
> are the best thing to use here).
>
> ** Summary **
> I am starting to think that we should be able to cover the reporting case
> plus some querying mechanism with something like:
> void reportSomethingToFEHandler(enum Reporting Kind, bool IsAbort, <some
> information>, const char* DefaultMsg, <pointer to a list of args to format
> in the DefaultMsg>)

Personally I dislike losing type safety in this kind of API ("here's a
blob of data you must programmatically query based on a schema implied
by the 'Kind' parameter & some documentation you read"). I'd prefer
explicit callbacks per thing - if we're going to have to write an
explicit structure & document the parameters to each of these
callbacks anyway, it seems easier to document that by API. (for fatal
cases we could have no default implementations - this would ensure
clients would be required to update for new callbacks & not
accidentally suppress them)
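
A rough illustration of the "no default implementation for fatal cases"
idea, assuming fatal notifications are declared pure virtual. All names
here (CodeGenEvents, onInlineAsmOutOfRegisters, RecordingClient) are
hypothetical and chosen only for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical interface mixing the two kinds of callback.
class CodeGenEvents {
public:
  virtual ~CodeGenEvents() = default;

  // Informational: a no-op default, so clients may ignore it.
  virtual void onStackSize(std::size_t /*bytes*/) {}

  // Fatal: pure virtual, so a client cannot be instantiated without
  // handling it. Adding a new fatal callback breaks the build of every
  // client until that client is updated.
  virtual void onInlineAsmOutOfRegisters(const std::string &function) = 0;
};

// A client must implement the fatal callback; the parameters are typed,
// so there is no 'Kind' plus untyped blob to decode.
class RecordingClient : public CodeGenEvents {
public:
  std::string LastFatal;
  void onInlineAsmOutOfRegisters(const std::string &function) override {
    LastFatal = "inline asm: out of registers in " + function;
  }
};
```

The trade-off is the one discussed above: each new notification needs a
new member function, but its parameters are documented by the signature
itself.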

> Where <some information>  is supposed to be the class/struct/pointer to the
> relevant information for this kind. If it is not enough the FE should call
> additional APIs to get what it wants.
>
> This looks similar to the “classical” back-end report to front-end approach,
> but gives more freedom to the front-end as it can choose what to do based on
> the attached information.
> I also believe this will reduce the need to expose back-end APIs and speed
> up the process.

Speed up the process of adding these diagnostics, possibly at the cost
of having a more opaque/inscrutable API to data from LLVM, it seems.

> However, the ability of the front-end (or client) to query the back-end is
> limited to the places where the back-end is reporting something. Also, if
> the back-end is meant to abort, the FE cannot do anything about it (e.g.,
> the stack size is not big enough for the jitted function).
> That’s why I said it covers “some” querying mechanism.
>
> ** Concerns **
> 1. Testing.

Testing - I assume we'd have opt/llc register for all these callbacks
& print them in some way (it doesn't need to have a "stack size is too
small warning" it just needs to print the stack size whenever it's
told - or maybe have some way to opt in to callback rendering) & then
check the behavior with FileCheck as usual (perhaps print this stuff
to stderr so it doesn't get confused with bytecode/asm under -o -).

That tests LLVM's contract - that it called the notifications.
Testing Clang's behavior when these notifications are provided would
either require end-to-end testing (just having Clang tests that run
LLVM, assume it already passes the LLVM-only tests & then test Clang
behavior on top of that) as we do in a few places already - or have
some kind of stub callback implementation we can point Clang to (it
could read a file of callbacks to call). That would be nice, but going
on past experience I don't suppose anyone would actually bother to
implement it.
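
As a sketch of the opt/llc side, a tool could register a handler that
just prints each notification on stderr in a stable, greppable form. The
line format and the names formatStackSizeNote and PrintingCallbacks are
assumptions made up for this example; nothing like this exists in the
tools today.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <string>

// Hypothetical rendering of one notification as a single text line.
inline std::string formatStackSizeNote(const std::string &function,
                                       std::size_t bytes) {
  return "notify: stack-size function=" + function +
         " bytes=" + std::to_string(bytes);
}

// Hypothetical handler a tool might install: it makes no decision about
// whether the size is "too big", it only reports what it was told.
class PrintingCallbacks {
public:
  void onStackSize(const std::string &function, std::size_t bytes) {
    // stderr keeps the notes out of the asm written to stdout by -o -.
    std::cerr << formatStackSizeNote(function, bytes) << '\n';
  }
};
```

With output like this, a test could pipe stderr to FileCheck and match a
line such as "CHECK: notify: stack-size function=foo bytes=24".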

>
> Assuming we will always emit these reports, relying on a front-end to filter
> out what is not currently relevant (e.g., we did not set the stack size
> warning in the FE), what will happen when we test (make check) without a
> front-end?
> I am afraid we will pollute all tests or we will have some difficulty
> testing a specific report.
>
> 2. Regarding a completely query-based approach, like Chris pointed out, I do
> not see how we can report consistent information at any given time. Also,
> Eric, coming back to your jit example, how could we prevent the back-end from
> aborting if the jitted function is too big for the stack?

Eric's (originally Chandler's, discussed in person) example wasn't
about aborting compilation. The idea was you JIT a function, you get a
callback saying "stack size 1 million bytes" and so you spin up a
thread that has a big stack to run this function you just compiled.

The point of the example is that a pure warning-based callback is
'general' for LLVM but very specific for LLVM /clients/ (LLVM as a
library, something we should keep in mind) - all they can do is print
it out. If we provide a more general feature for LLVM clients
(callbacks that notify those clients about things they might be
interested in, like the size of a function's stack) then they can
build other features (apart from just warning) such as a JIT that
dynamically chooses thread stack size based on the stack size of the
functions it JITs.
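
The JIT case could be sketched roughly as follows, assuming the client
receives the stack size through a callback and then picks a thread stack
size before running the function. StackSizeListener,
chooseThreadStackSize, and the headroom parameter are hypothetical; the
chosen value would then be handed to whatever thread API the client uses
(for example pthread_attr_setstacksize on POSIX).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical client-side listener: remembers the largest stack size
// reported for the functions it has JITed.
struct StackSizeListener {
  std::size_t MaxFrameBytes = 0;
  void onStackSize(std::size_t bytes) {
    MaxFrameBytes = std::max(MaxFrameBytes, bytes);
  }
};

// Pick a thread stack size: keep the default unless the reported frame
// plus some headroom for callees does not fit in it.
inline std::size_t chooseThreadStackSize(std::size_t reportedBytes,
                                         std::size_t defaultBytes,
                                         std::size_t headroomBytes) {
  return std::max(defaultBytes, reportedBytes + headroomBytes);
}
```

Nothing here is a warning: the same notification that a compiler driver
might turn into a diagnostic is used by this client to make a scheduling
decision.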

>
> 3. Back to the strictly reporting approach where we extend the inlineasm
> handler (the approach proposed by Bob and that I sketched a little bit
> more): it now looks similar to this approach, except that the back-end
> chooses what is relevant to report and the back-end does not need to pass
> down the information.
> The concern is how do we easily (in a robust and extendable manner) provide
> a front-end/back-end option for each warning/error?
>
>
> Thoughts?
>
> Cheers,
>
> -Quentin
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>