[polly] r264118 - [ScopInfo] Fix domains after loops.

Mon Apr 4 08:20:43 PDT 2016

2016-04-04 16:09 GMT+02:00 Johannes Doerfert <doerfert at cs.uni-saarland.de>:
>> Don't we have a separate context that we know that is true due to
>> otherwise undefined behaviour, but don't add runtime checks for it?
> We don't have such a context, mainly because we "do not need it" but also
> because "it would always be incomplete". I do not want to say we could
> not use that for user feedback but it is not something I would always
> compute just for the sake of it. Nevertheless, I am strongly supporting
> Polly to exploit undefined behaviour.

Nothing wrong with exploiting undefined behaviour but we should make
sure it doesn't break invariants.

I think such a context would be useful. We could gist the domains/RTCs
over it and hence simplifying them.

>> >> >In all cases, I think we should add domain lower and upper bound only
>> >> >together and we shouldn't allow ISL to add new contraints when leaving
>> >> >the loop. It might find other contraints in  [p] -> { [i0] : i0 > 0
>> >> >and <contraints found in the loop> }.
>> >>
>> >> I think it might be good to do what you did in your patch, but before moving
>> >> ahead I would like to understand if this is a correctness issue
>> >> or just constraints that we can make simpler.
>> > There are different things here.
>> >   1) Adding lower and upper bounds only together is not as simple as
>> >      there is no single "lower" and "upper" bound but only pieces which
>> >      cannot be looked up somewhere.
>>·
>> I don't understand this argument. This is what
>> Scop::addLoopBoundsToHeaderDomain() does.
> Close but it doesn't provide a "upper bound value" and it uses the
> propagated domain to come up with some "upper bound function" in the
> first place. This function is actually close to the
> propagateDomainConstrains function but it also deals with the fact that
> there are different loop iterations involved. As this is a complicated
> topic we might want to start a new thread for this where you can explain
> the problem of the current approach [only if it can be isolated from
> this thread].
>

OK.

>> >   2) I agree that we might not want to keep constraints from inside a
>> >      loop but I am not even sure about it. Since I don't think we have
>> >      a correctness issue either way I would like to go forward only with
>> >      a motivating example that profits from this.
>>·
>> It is a correctness issue in the LNT Adobe-C++ loop_unroll test case.
> So you are positive that the domain is __incorrect__ for this test case
> and that it is the root cause of the failing test? If so I will take a
> closer look into that test [though it seems we pass all lnt test since
> today].

It is incorrect if you agree that

Context \cap dom(EntryBB) == Context \cap \bigcup_{E \in LoopExits} dom(E)

is an invariant that should hold fer every loop. E.g. CodeGeneration
implicitly assumes this.

> By the way, what were the symptoms of that failure? Compile or execution
> time failure? Violated assertion or just crash?

Invalid IR generation. It would need to generate an exit-node PHI to
distinguish the existing blocks of two incoming BBs of (the same)
non-affine subregion. The subregion has been removed because it is an
error block. As the subregion is executed unconditionally,
CodeGeneration should be rightly able to assume its existence.

> There is a known bug in
> the RTCs [PR26683] that is currently my goto candidate if we crash at
> execution time but should not. However, we also have/had a couple of
> other bugs that are open [PR27013, PR27195] or have been fixed recently
> [r265286, r265280, r265261, r265132]. Maybe one of them was the culprit.
>
>> >> Your example above suggests that you have a correctness issue in mind. As
>> >> said above, do we already have (or can we draft) such a test case?
>> >>
>> >> > D18450 cannot help when the loop
>> >> >has multiple exits.
>> >>
>> >> Agreed. And for whatever reason it also did not fix the bug we see on LNT.
>> > That statement is not helping. AFAIK, there is little evidence that the
>> > additional constraints cause the lnt failure.
>> Huh?
> Correct me if I am wrong but I understood the chaing of events like like this:
>   1) We had LNT failures in some build.
>   2) Your [FIX] Domain ... patch resulted in a "different" domain that
>      caused us to dismiss SCoPs statically that would/should have a
>      "false" RTC at runtime anyway but for which we could not show that
>      with the "former" domain.
>   3) The LNT tests were now passing.
>
> If this listing of events is "accurate" I am not convinced that the problem was
> actually the domain. Instead I think it might have been in the code generation
> that was triggered before but not after your patch.  While code generation
> should not change the semantics as the RTCs should be false anyway, it is
> possible that this happend. A bug in the RTCs or any other code generation
> part could cause problems even if the domains are fine and it would be hidden
> if we can now statically drop the SCoPs.

Commit r265285 overwrites the domain to the same domain as this patch
would compute.

>> > From what I understand,
>> > the additional constrains prevent us from statically dropping the SCoP,
>> > thus it mainly will hide bugs in all parts after the ScopInfo.
>>
>> The SCoP _must_ be dropped because of an unconditional "error"
>> subregion (it calls printf). Not dropping it would either make us
>> transform a scop with a function call or miscompile because the error
>> subregion computes an escaping value.
>
> We never have to "drop" a SCoP statically, that is just an optimization we
> perform. What we have to do is to make sure the "Polly optimized version" is
> only executed if all assumptions were true at runtime. If we can statically
> show that this will never be the case there is no need in generating the "Polly
> optimized version", but it should never be a problem to do so as the RTCs are
> supposed to guard the execution. Do not get me wrong, we have bugs in the RTCs
> and the code generation all the time, thus this is more a theoretical argument
> but it is one nevertheless.

I think the we cannot regards dropping scops statically in the case of
error blocks. Executing the generated scop in a context that would
execute the error block would be invalid. In our case, the error block
is executed unconditionally, hence we should never execute the
generated code.

It is a bit more complicated because undefined behaviour is involved.
The RTCs do execute the Polly code only for a setting that invokes
undefined behaviour in the loop, hence Tobias convinced me that this
is not a miscompilation. However, this ub violates invariants assumed
by later code. Here: Domains of entering and exiting a non-infinite
loop are not equal, always-executed statements that do not have
complete domain, and optimizing away a block that is executed
unconditionally. There could be more such invariants that could cause
more problems.

In math, it would be a "Ex falso quodlibet" situation. We have an
impossible/ub-condition for entering the generated code. Any
transformation would be valid for it as it is never/only in
ub-conditions executed and I don't think our passes are prepared for
that. It would be easiest to just discard it (IMHO).

Michael