[polly] ScalarEvolution not supporting SDiv, SRem [was: r244903 - Add test case for SCEV synthesizing]

Mon Aug 17 06:58:20 PDT 2015

On 08/17/2015 03:31 PM, Michael Kruse wrote:
> 2015-08-17 15:15 GMT+02:00 Tobias Grosser <tobias at grosser.es>:
>> On 08/14/2015 09:16 PM, Michael Kruse wrote:
>>> AFAIK such uses of SCEVExpander can only occur for scop parameters,
>>> but nothing within the scop. Everything from within the scop is
>>> generated from the isl_ast/isl_pw_aff.
>>
>>
>> Actually, even within the scop we are using the SCEVExpander in
>> BlockGenerator::getNewValue so this problem may even reach out
>> until there.
>
> Yes, I didn't see that.
>
>
>>> Possible solutions:
>>>
>>> 1.) Remove the extension for SDiv/SRem
>>
>>
>> Remove (or rather conditionally disable) this extension is is a valid
>> solution. It seems SDiv/SRem support is fundamentally incomplete. I
>> would be very happy if you could fix this, but it seems equally
>> fine to just disable it for now and go ahead with whatever
>> you planned to do in the first place.
>
> I expected this would be rejected right away. I must be a reason why
> this extension got implemented in the first place.

There is. SRem support is of use for stencils for example. However, as someone
(probably me) committed an incomplete/buggy implementation it seems fine
to temporarily disable it until it works such that we make sure you can
continue to follow your actual goal.

>>>
>>> 2.) Teach ScalarEvolution handle SDiv and SRem, including the
>>> introduction of SCEVSDivExpr and SCEVSRemExpr. This requires changing
>>> LLVM itself.
>>
>>
>> Right, this needs LLVM changes. The other issue with this is, that I
>> believe we will at some point hit fundamental limitations of the
>> SCEVExpander. Specifically, I hope that we can at some point have the
>> SCEVValidator allow ternary conditions, min/max expressions, ...
>>
>> Especially for ternary expressions I do not really see how we can
>> add them to SCEV and/or the SCEV expander. This really seems to
>> be out of scope.
>
> Is there some conceptual problem in ScalarEvolution what makes this impossible?

Not impossible. However, scalar evolution is mostly used for simpler loop
optimizations. It is not clear to me that the additional complexity
to reason about piecewise expressions is actually needed in this context.

> If this is the case, doesn't this mean we shouldn't use ScalarEvolution at all?

It is very useful to reason about loop induction variables. We need a little
bit more, but the hybrid approach we use looks (besides being incomplete)
pretty good to me.

>>> 4.) Fix it up after SCEVExpander: Copy instructions/expression trees
>>> that are referenced in the generated code,(but defined later) to
>>> before the generated code. This possible because SCEVValidator checks
>>> that the operands (and their operands, recursively) are
>>> scop-invariant.
>>
>>
>> Right, that could work. I did not look into this in detail yet, but
>> this seems to be the approach you have taken so far. As for now,
>> there seems to exist at least a bug in the implementation.
>
> For http://reviews.llvm.org/D12053 I chose 6.)

Ah, right.

>>> 5.) Demote SCEV's with SRem/SDiv instruction to at most IV, so they
>>> cannot appear in SCEVs outside of the loop. (Makes 4 test cases fail).
>>
>>
>> I do not understand this approach. What does it mean to demote to at most
>> "IV".
>
> The problem is that the sdiv/srem are categorized as loop-invariant

scop-invariant?

> event though are are in the loop themselves. Categorized them as
> loop-variant in SCEVValidator would get around this, but probably also
> takes away its usefulness.

Ah, now I see what you mean. Categorizing them as IV indeed does not
sound correct.

>>> 6.) Move scop-invariant SRem/SDivs to polly.split_new_and old before
>>> CodeGeneration.
>>
>>
>> Would this be a general solution? I assume this could fix the issues
>> we have currently test cases for, but probably not the ones where the
>> sdivs are loop-variant, no?
>
> Right. I actually didn't know there the SCEVExpander is invoked within
> the BlockGenerator, so I din't think thous could happen. There is a
> function generateSCEV which for some reason is not used there.

generateSCEV seems just a minor helper function to adjust the
InsertLocation.

>>> Opinions? IMHO 2.) is the cleanest. 4.) and 6.) would be a quick fixups.
>>
>>
>> I must admit, I currently have a little bias to the SCEV based approach
>> I proposed in the patch review of 4), but may be convinced that some
>> other solution might be better.
>
> Which number? They all are related to SCEVs.

Ah, 6b). Instead of analyzing ourselves what to move, we just look at the
SCEVUnknowns that are within the scop and only recreate these instructions
(but look before at the SCEVs of their operands and expand them in the same way).
Johannes patch is coming relatively close to what I had in mind. (I am not
yet sure it solves the issues for the loop-variant srem instructions, though).

> I implemented 6.) as already mostly as workaround get some a "PHI
> nodes in exit blocks" solution without lnt failures.

I think some variation of 6 (b?) is the best mid-term choice.

Best,
Tobias