[PATCH 02/11] Support the new DiagnosticRemarks

Tue Jun 10 04:52:15 PDT 2014

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 06/10/2014 01:42 PM, Tobias Grosser wrote:
> On 10/06/2014 13:06, Andreas Simbuerger wrote:
>> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
>> 
>> Hey thanks for reviewing this. As you have noticed, the patch is
>> in a very raw state so far. Before going into refining the patch,
>> I need an oppinion on the message format. How do we want to give
>> the clang user hints on what he can do to increase the size of
>> the Scops?
>> 
>> 1) "You should remove that alias here..". "Try eliminating that
>> parameter here...".
>> 
>> or more like a simple status dump: 2) "This access is not affine
>> ..."
>> 
>> Variant (1) looks harder to get the remark right in terms of 
>> additional diagnostic info required (e.g., parameters that cause 
>> non-affinity).
>> 
>> Variant (2) is less precise but easier.
> 
> Let's start with two. We can later add additional suggestions how
> to fix such problems.
> 
>> On 06/10/2014 08:58 AM, Tobias Grosser wrote:
>>> On 09/06/2014 02:42, Andreas Simbuerger wrote:
>>>> After the ScopDetection is finished, we provide Diagnostic 
>>>> remarks for all error entries of each valid Scops parent
>>>> region.
>>>> 
>>>> The goal is to provide developers with useful hints about 
>>>> opportunities to increase the size of the detected Scops in 
>>>> their code.
>>> 
>>> Hi Andreas,
>>> 
>>> very nice that you work on this.
>>> 
>>> I have one high-level question. It seems you added for each
>>> and every error a ShortMessage, but some of them contain FIXME
>>> or reference LLVM-IR constructs that are not useful to the end
>>> user. Would it make sense to introduce in a first patch only
>>> one diagnostic for a single error where we know we can provide
>>> an appropriate error message and introduce a fallback option
>>> for the others that informs that other unspecified problems
>>> have occurred. This would avoid ever reporting any
>>> useless/confusing information (or even FIXMEs).
>>> 
>> 
>> Yep, I hoped that during review we might come up with better
>> short messages than the ones I came up with and useful ones for
>> the FIXMEs ;-).
>> 
>> Reporting something like 'Unspecified Error' seems reasonable.
>> One might argue that giving the 'Generic/Unspecified error' is
>> about as useful as the FIXME stuff'. I change it in the commit.
> 
> The main idea we reduce the patch size/code that needs to be
> discussed. In the first step we get the infrastructure in place, in
> the second step we add most of the error messages. This has three
> benefits:
> 
> 1) We can focus on getting the infrastructure right in the first
> patch, without being distracted by discussions about how to word
> the error messages
> 
> 2) For error messages that are not user visible, we do not even
> need to define error messages
> 
> 3) We can add test cases on the go.
> 
>>>> include/polly/ScopDetectionDiagnostic.h  | 160
>>>> ++++++++++++---- lib/Analysis/ScopDetection.cpp           |
>>>> 47 +++-- lib/Analysis/ScopDetectionDiagnostic.cpp | 309 
>>>> ++++++++++++++++++++++++++++++-
>>> 
>>> The nice thing about optimization remarks is that they can be 
>>> nicely tested. Would it be possible to add tests?
>> 
>> It would be. But looks like a big pile of work ;-) *sigh*.
> 
> Right.
> 
>>> Specifically, if we start with just a single error, we could 
>>> immediately add a test case. When adding the known errors 
>>> one-by-one, we could then continue to add test cases.
>> 
>> As soon as I get around to it I'll add one simple test for each 
>> reported error.
> 
> Right. That's why I suggested to take a step-by-step approach. We 
> add the infrastructure for the remark support, a fall-back
> error-report in case no EndUserMessage is provided and for a single
> case a good end-user message and the corresponding test case.
> 
> We can then add refine add additional error messages + tests one by
> one. (I would assume that for many, we do not even need end user
> messages)
> 
> Does this make sense?
> 
>>>> +  // Emit diagnostic remarks for the parents of valid Scops.
>>>> + // This should help users with increasing the size of
>>>> valid Scops.
>>> 
>>> Grammar?
>>> 
>>>> +  for (const Region *R : ValidRegions) { +    const Region 
>>>> *Parent = R->getParent(); +    if ((Parent != TopRegion) && 
>>>> RejectLogs.count(Parent)) +      emitRejectionRemarks(F, 
>>>> RejectLogs.at(Parent), LI); +  }
>>> 
>>> Are you saying we call this even if -Rpass was not specified.
>>> Did you check the compile-time impact of this on a larger file?
>>> Or maybe the LLVM test-suite?
>> 
>> No, I have not tested the compile-time impact. Unfortunately we
>> do not have access to the clang command line flags in opt afaik.
>> I could check if the flag is enabled in opt easily but not in
>> clang.
>> 
>> So either we emit them all the time (1), or emit them as soon as
>> the user passes the polly-detect-track-failures option (2).
>> 
>> (1) might be bad in terms of compile-time. (2) is bad, if the
>> clang user has to know that he needs polly-detect-track-failures
>> in addition to -Rpass=".*"
> 
> I propose to keep them enabled as they are and watch the
> performance testers when committing. If you see regressions, we can
> analyse them and in the worst case use
> -polly-detect-track-faiures.
> 
>> Would be easier to have a way of checking that the remarks are 
>> requested by the user.
> 
> Right.
> 
>>>> +void emitRejectionRemarks(const llvm::Function &F, const 
>>>> RejectLog &Log, +                          LoopInfo *LI) { + 
>>>> LLVMContext &Ctx = F.getContext(); + +  const Region *R = 
>>>> Log.region(); +  Loop *L = LI->getLoopFor(R->getEntry()); + +
>>>> if (L) { +    DebugLoc DL = L->getStartLoc(); + 
>>>> Ctx.diagnose(DiagnosticInfoOptimizationRemark( + DEBUG_TYPE,
>>>> F, DL, +        "The following errors keep this loop from
>>>> being a Scop.")); +    for (RejectReasonPtr RR : Log) { + 
>>>> Ctx.diagnose(DiagnosticInfoOptimizationRemark( + DEBUG_TYPE,
>>>> F, RR->getDebugLoc(), RR->getShortMessage())); + } +  }
>>> 
>>> This seems to assume each region starts with a loop? What
>>> about regions that contain multiple loops? Or a condition and
>>> only then a loop?
>>> 
>>> What is the rational of only printing errors if we can at the
>>> same time print a surrounding loop?
>> 
>> Mhhh. Actually that is a bit short-sighted you're right. After
>> giving it a second thought it shouldn't be restricted at all und
>> just print something like: "The size of the detected Scop can be
>> increased by fixing the following errors."
> 
> This makes more sense.
> 
> Btw, it seems we do not report errors for regions where there is
> not even a child region detected as valid.
> 
> It may be useful to report in this case errors for the smallest
> region that contains at least one loop.

Yep, the main idea behind it is that it might get a bit overwhelming
as soon as we process larger files and spam the user with errors about
each leaf in the region tree.

We could mask them in the default output? By default we use the
DEBUG_TYPE string to mask the diagnostics, but we could add something
like DEBUG_TYPE-all or DEBUG_TYPE-verbose for those
extra child regions. So the user can specify:
- -Rpass="polly-detect*" to get everything.

> 
>>>> std::string ReportNonAffBranch::getMel, I just split the
>>>> patch. I don't have to do a lot of work there, git ressage()
>>>> const { return ("Non affine branch in BB '" +
>>>> BB->getName()).str() + "' with LHS: " + *LHS + " and RHS: " +
>>>> *RHS; }
>>>> 
>>>> +//===----------------------------------------------------------------------===//
>>>>
>>>>
>>>>
>>>>
>>
>>>> 
+// ReportNoBasePtr.
>>>> + +std::string ReportNoBasePtr::getShortMessage() const { + 
>>>> return "I don't have a base pointer for this array"; +}
>>> 
>>> This may need some rewording.
>> 
>> Ok.
> 
> Or just leave it out for now (it should fall back to the default
> message)
> 
>>>> std::string ReportNoBasePtr::getMessage() const { return "No 
>>>> base pointer"; }
>>>> 
>>>> +//===----------------------------------------------------------------------===//
>>>>
>>>>
>>>>
>>>>
>>
>>>> 
+// ReportUndefBasePtr.
>>>> + +std::string ReportUndefBasePtr::getShortMessage() const {
>>>> + return "The base pointer for this array is undefined"; +}
>>> 
>>> This does also not seem to make sense for an end-user. Most
>>> likely the compiler introduced for some reason an undef value.
>>> I think it is better to admit we do not have a good explanation
>>> for this then to emit a remark that makes little sense to the
>>> user. (Most likely this only happens for bugpoint reduced test
>>> cases, so it is not end-user relevant anyway).
>> 
>> Alright. Or maybe something like "Could not find the array for
>> the access?" ?
> 
> Let's leave it out for now. Let's first get the basic
> infrastructure in.
> 
>>>> std::string ReportUndefBasePtr::getMessage() const { return 
>>>> "Undefined base pointer"; }
>>>> 
>>>> +//===----------------------------------------------------------------------===//
>>>>
>>>>
>>>>
>>>>
>>
>>>> 
+// ReportVariantBasePtr.
>>>> + +std::string ReportVariantBasePtr::getShortMessage() const
>>>> { + return "The base address of this array is not invariant
>>>> inside the loop"; +}
>>> 
>>> This is very nice. If we could now also get the name of the
>>> array from the debug info, we would have a very nice end-user
>>> message.
>> 
>> I'm not an expert on using the DebugInfo metadata, do you happen
>> to know where I can find a small example on how to retreive it
>> properly?
> 
> Again, let's leave it out for now. We can an improve on this
> later.
> 

I agree to all the leaving out above.

>>>> +std::string ReportSimpleLoop::getShortMessage() const { + 
>>>> return "Try simplifying the loop (-loop-simplify)"; +}
>> 
>> What would be a better wording?
> 
> This should never happen for end users. I would leave it out for
> now.
> 
> (I know some people dislike if I propose to reduce the initial
> patch, so please let me know if my 'leave out' comments are
> discouraging. My understanding is that as soon as we got the basic
> patch in, all these commits should be very simple and should not
> require a lot of discussions).
> 

Nah no problem at all, I just split the patch. I don't have to do a
lot of work there, git rebase does the hard work ;-).

> Cheers, Tobias
> 

- -- 
Andreas Simbürger
University of Passau - Programming Group
http://www.infosun.fim.uni-passau.de/cl/staff/simbuerger/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)

iQEcBAEBAgAGBQJTlvFvAAoJELS7a/6Ti2HYp6UIAM+MeCmHfIAtqs3TpH6l9CyA
gQF9WB8FzXTH0qPdsmB8KJyZ4g51As8f/nhu5yJW4JLosYRTURkeEd/hy2OiRCpL
ByY6ozhk4kUP5mlF9ZGr/ZGdPEefha+HEzQgtaYaePEHPp2/VHDKJhFU3pP+7g8d
DG3++p7GbXOz5hnBZdEHfJht3oJiN5/Uh52eCDozE5WMUbRVFIUU+Q5eBSx+HBAS
ypTJ9Iyr37cheLBTdAzZ3qphJzXbZdJqEpf/AyUmtATfhkkMgkk5qkvDmEwkrjjf
+CJyjNzGeZbOaats/8LGrCYw4W2ezX2YIMqeZDnChKQ9sGyxhC0UEStizjQXw/M=
=AKZo
-----END PGP SIGNATURE-----