[PATCH 02/11] Support the new DiagnosticRemarks

Tobias Grosser tobias at grosser.es
Tue Jun 10 04:42:26 PDT 2014


On 10/06/2014 13:06, Andreas Simbuerger wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hey thanks for reviewing this. As you have noticed, the patch is in a
> very raw state so far. Before going into refining the patch, I need an
> oppinion on the message format.
> How do we want to give the clang user hints on what he can do to
> increase the size of the Scops?
>
> 1)
>   "You should remove that alias here..".
>   "Try eliminating that parameter here...".
>
> or more like a simple status dump:
> 2)
>   "This access is not affine ..."
>
> Variant (1) looks harder to get the remark right in terms of
> additional diagnostic info required (e.g., parameters that cause
> non-affinity).
>
> Variant (2) is less precise but easier.

Let's start with two. We can later add additional suggestions how to fix 
such problems.

> On 06/10/2014 08:58 AM, Tobias Grosser wrote:
>> On 09/06/2014 02:42, Andreas Simbuerger wrote:
>>> After the ScopDetection is finished, we provide Diagnostic
>>> remarks for all error entries of each valid Scops parent region.
>>>
>>> The goal is to provide developers with useful hints about
>>> opportunities to increase the size of the detected Scops in
>>> their code.
>>
>> Hi Andreas,
>>
>> very nice that you work on this.
>>
>> I have one high-level question. It seems you added for each and
>> every error a ShortMessage, but some of them contain FIXME or
>> reference LLVM-IR constructs that are not useful to the end user.
>> Would it make sense to introduce in a first patch only one
>> diagnostic for a single error where we know we can provide an
>> appropriate error message and introduce a fallback option for the
>> others that informs that other unspecified problems have occurred.
>> This would avoid ever reporting any useless/confusing information
>> (or even FIXMEs).
>>
>
> Yep, I hoped that during review we might come up with better short
> messages than the ones I came up with and useful ones for the FIXMEs ;-).
>
> Reporting something like 'Unspecified Error' seems reasonable. One
> might argue that giving the 'Generic/Unspecified error' is about as
> useful as the FIXME stuff'. I change it in the commit.

The main idea we reduce the patch size/code that needs to be discussed.
In the first step we get the infrastructure in place, in the second step 
we add most of the error messages. This has three benefits:

1) We can focus on getting the infrastructure right in the first patch,
    without being distracted by discussions about how to word the
    error messages

2) For error messages that are not user visible, we do not even need to
    define error messages

3) We can add test cases on the go.

>>> include/polly/ScopDetectionDiagnostic.h  | 160 ++++++++++++----
>>> lib/Analysis/ScopDetection.cpp           |  47 +++--
>>> lib/Analysis/ScopDetectionDiagnostic.cpp | 309
>>> ++++++++++++++++++++++++++++++-
>>
>> The nice thing about optimization remarks is that they can be
>> nicely tested. Would it be possible to add tests?
>
> It would be. But looks like a big pile of work ;-) *sigh*.

Right.

>> Specifically, if we start with just a single error, we could
>> immediately add a test case. When adding the known errors
>> one-by-one, we could then continue to add test cases.
>
> As soon as I get around to it I'll add one simple test for each
> reported error.

Right. That's why I suggested to take a step-by-step approach. We
add the infrastructure for the remark support, a fall-back error-report 
in case no EndUserMessage is provided and for a single case a good
end-user message and the corresponding test case.

We can then add refine add additional error messages + tests one by one.
(I would assume that for many, we do not even need end user messages)

Does this make sense?

>>> +  // Emit diagnostic remarks for the parents of valid Scops. +
>>> // This should help users with increasing the size of valid
>>> Scops.
>>
>> Grammar?
>>
>>> +  for (const Region *R : ValidRegions) { +    const Region
>>> *Parent = R->getParent(); +    if ((Parent != TopRegion) &&
>>> RejectLogs.count(Parent)) +      emitRejectionRemarks(F,
>>> RejectLogs.at(Parent), LI); +  }
>>
>> Are you saying we call this even if -Rpass was not specified. Did
>> you check the compile-time impact of this on a larger file? Or
>> maybe the LLVM test-suite?
>
> No, I have not tested the compile-time impact. Unfortunately we do not
> have access to the clang command line flags in opt afaik. I could
> check if the flag is enabled in opt easily but not in clang.
>
> So either we emit them all the time (1), or emit them as soon as the
> user passes the polly-detect-track-failures option (2).
>
> (1) might be bad in terms of compile-time.
> (2) is bad, if the clang user has to know that he needs
> polly-detect-track-failures in addition to -Rpass=".*"

I propose to keep them enabled as they are and watch the performance 
testers when committing. If you see regressions, we can analyse them and 
in the worst case use -polly-detect-track-faiures.

> Would be easier to have a way of checking that the remarks are
> requested by the user.

Right.

>>> +void emitRejectionRemarks(const llvm::Function &F, const
>>> RejectLog &Log, +                          LoopInfo *LI) { +
>>> LLVMContext &Ctx = F.getContext(); + +  const Region *R =
>>> Log.region(); +  Loop *L = LI->getLoopFor(R->getEntry()); + +  if
>>> (L) { +    DebugLoc DL = L->getStartLoc(); +
>>> Ctx.diagnose(DiagnosticInfoOptimizationRemark( +
>>> DEBUG_TYPE, F, DL, +        "The following errors keep this loop
>>> from being a Scop.")); +    for (RejectReasonPtr RR : Log) { +
>>> Ctx.diagnose(DiagnosticInfoOptimizationRemark( +
>>> DEBUG_TYPE, F, RR->getDebugLoc(), RR->getShortMessage())); +
>>> } +  }
>>
>> This seems to assume each region starts with a loop? What about
>> regions that contain multiple loops? Or a condition and only then a
>> loop?
>>
>> What is the rational of only printing errors if we can at the same
>> time print a surrounding loop?
>
> Mhhh. Actually that is a bit short-sighted you're right. After giving
> it a second thought it shouldn't be restricted at all und just print
> something like: "The size of the detected Scop can be increased by
> fixing the following errors."

This makes more sense.

Btw, it seems we do not report errors for regions where there is not 
even a child region detected as valid.

It may be useful to report in this case errors for the smallest region 
that contains at least one loop.

>>> std::string ReportNonAffBranch::getMessage() const { return ("Non
>>> affine branch in BB '" + BB->getName()).str() + "' with LHS: " +
>>> *LHS + " and RHS: " + *RHS; }
>>>
>>> +//===----------------------------------------------------------------------===//
>>>
>>>
>>>
> +// ReportNoBasePtr.
>>> + +std::string ReportNoBasePtr::getShortMessage() const { +
>>> return "I don't have a base pointer for this array"; +}
>>
>> This may need some rewording.
>
> Ok.

Or just leave it out for now (it should fall back to the default message)

>>> std::string ReportNoBasePtr::getMessage() const { return "No
>>> base pointer"; }
>>>
>>> +//===----------------------------------------------------------------------===//
>>>
>>>
>>>
> +// ReportUndefBasePtr.
>>> + +std::string ReportUndefBasePtr::getShortMessage() const { +
>>> return "The base pointer for this array is undefined"; +}
>>
>> This does also not seem to make sense for an end-user. Most likely
>> the compiler introduced for some reason an undef value. I think it
>> is better to admit we do not have a good explanation for this then
>> to emit a remark that makes little sense to the user. (Most likely
>> this only happens for bugpoint reduced test cases, so it is not
>> end-user relevant anyway).
>
> Alright. Or maybe something like "Could not find the array for the
> access?" ?

Let's leave it out for now. Let's first get the basic infrastructure in.

>>> std::string ReportUndefBasePtr::getMessage() const { return
>>> "Undefined base pointer"; }
>>>
>>> +//===----------------------------------------------------------------------===//
>>>
>>>
>>>
> +// ReportVariantBasePtr.
>>> + +std::string ReportVariantBasePtr::getShortMessage() const { +
>>> return "The base address of this array is not invariant inside
>>> the loop"; +}
>>
>> This is very nice. If we could now also get the name of the array
>> from the debug info, we would have a very nice end-user message.
>
> I'm not an expert on using the DebugInfo metadata, do you happen to
> know where I can find a small example on how to retreive it properly?

Again, let's leave it out for now. We can an improve on this later.

>>> +std::string ReportSimpleLoop::getShortMessage() const { +
>>> return "Try simplifying the loop (-loop-simplify)"; +}
>
> What would be a better wording?

This should never happen for end users. I would leave it out for now.

(I know some people dislike if I propose to reduce the initial patch, so 
please let me know if my 'leave out' comments are discouraging. My 
understanding is that as soon as we got the basic patch in, all these 
commits should be very simple and should not require a lot of discussions).

Cheers,
Tobias



More information about the llvm-commits mailing list