[PATCH] PR17473: Disallow LSR to perform non-invertible normalization

Mon Mar 17 18:04:07 PDT 2014

Agree with Andy's comments on the examples. I always find too much
detail to be better than not enough :)

Thanks!

-eric

On Mon, Mar 17, 2014 at 5:54 PM, Michael Zolotukhin
<mzolotukhin at apple.com> wrote:
> Hi Andy,
>
> Here is an updated version. Could you please take a look if the added comment is what you asked for?
>
>
>
>
> I’d also like to wait a little bit to give other reviewers a chance to provide their feedback on this.
>
> Also, I filed a bug on trunc normalization:
> http://llvm.org/bugs/show_bug.cgi?id=19168
>
> Thanks,
> Michael
>
> On Mar 17, 2014, at 3:45 PM, Andrew Trick <atrick at apple.com> wrote:
>
>>
>> On Mar 17, 2014, at 3:36 PM, Michael Zolotukhin <mzolotukhin at apple.com> wrote:
>>
>>> I added a test to show that the patch actual does something.
>>>
>>> <stride-norm-v2.patch>
>>
>> Since you’re using -debug, I think you should have this line at the top of the test:
>> ; REQUIRES: asserts
>>
>> The motivation for this fix is quite tricky. Could you make some attempt to comment the code with a small example of how the normalization is incorrect if we fail to normalize the stride?
>>
>> Otherwise, good. Go ahead and checkin.
>>
>> -Andy
>>
>>>
>>> Thanks,
>>> Michael
>>>
>>> On Mar 17, 2014, at 12:46 PM, Michael Zolotukhin <mzolotukhin at apple.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> Here is the stride normalization fix. Check-all passed, performance testing didn’t show any changes.
>>>> Is it ok for trunk?
>>>>
>>>> <stride-norm.patch>
>>>>
>>>> Thanks,
>>>> Michael
>>>>
>>>> On Mar 11, 2014, at 6:32 PM, Eric Christopher <echristo at gmail.com> wrote:
>>>>
>>>>> On Tue, Mar 11, 2014 at 6:28 PM, Andrew Trick <atrick at apple.com> wrote:
>>>>>
>>>>> On Mar 11, 2014, at 4:31 PM, Michael Zolotukhin <mzolotukhin at apple.com> wrote:
>>>>>
>>>>>> Hi Andy,
>>>>>>
>>>>>> I’ve run specs+test-suite with the same compiler (which aborts when LSR wants to discard a user), and got the following list of fails:
>>>>>>
>>>>>> New Failures - Compile Time ▾
>>>>>> External/SPEC/CINT2000/254_gap/254_gap
>>>>>> External/SPEC/CINT2000/255_vortex/255_vortex
>>>>>> External/SPEC/CINT2006/400_perlbench/400_perlbench
>>>>>> External/SPEC/CINT95/147_vortex/147_vortex
>>>>>> MultiSource/Applications/ClamAV/clamscan
>>>>>> MultiSource/Benchmarks/7zip/7zip-benchmark
>>>>>> MultiSource/Benchmarks/mediabench/mpeg2/mpeg2dec/mpeg2decode
>>>>>>
>>>>>> Then I tried a possible fix, that we discussed yesterday: I added normalization of the stride before subtracting/adding it during NormalizeAutodetect/Denormalize. However, that fixed only one fail, the one from ‘make check’, described in the previous mail. The fails in specs and tests from test-suite persisted.
>>>>>>
>>>>>> I again looked at a couple of expressions, which we are failing to denormalize to the original version, and they were like this one:
>>>>>>   ORIGINAL ISE:   {(trunc i64 (1 + %i.0.ph) to i32),+,1}<%while.cond>
>>>>>>   NORMALIZED ISE:         {(trunc i64      %i.0.ph  to i32),+,1}<%while.cond>
>>>>>>   DENORMALIZED ISE:       {(1 + (trunc i64 %i.0.ph to i32)),+,1}<%while.cond>
>>>>>>
>>>>>> I.e. we didn’t move constants to trunc. I suppose this behavior is correct.
>>>>>>
>>>>>> Thus, I think we could commit the original patch, and then I’ll send a patch, which adds step normalization. What do you think?
>>>>>
>>>>> I was aware that SCEV is unable to canonicalize truncs. That's what made me nervous about this patch, which assumes canonical expressions. However, I don’t think this problem is worth changing your fix. I agree:
>>>>>
>>>>> (1) checkin this patch to get the bug fix in
>>>>>
>>>>> (2) follow up with the stride normalization fix
>>>>>
>>>>> and I would add
>>>>>
>>>>> (3) File a radar to investigate normalizing truncs, explaining how to easily reproduce the problem. We could probably fix this issue, but I’m nervous about it affecting SCEV consumers that already assume certain behavior. It will need even more in depth performance analysis.
>>>>>
>>>>>
>>>>> Ahem. "File an external bug report and clone that into a radar." :)
>>>>>
>>>>> -eric
>>>>>
>>>>> -Andy
>>>>>
>>>>>> Thanks,
>>>>>> Michael
>>>>>>
>>>>>> On Mar 10, 2014, at 5:55 PM, Andrew Trick <atrick at apple.com> wrote:
>>>>>>
>>>>>>>
>>>>>>> On Mar 10, 2014, at 5:12 PM, Michael Zolotukhin <mzolotukhin at apple.com> wrote:
>>>>>>>
>>>>>>>> I’ve run 'make check-all’ with llvm_unreachable in the branch where we discard a user, and actually got some interesting result. There were two failing tests, one of which is obviously the just added test.
>>>>>>>>
>>>>>>>> The second one is CodeGen/X86/lsr-normalization.ll, and here is what I got there:
>>>>>>>>
>>>>>>>>   ORIGINAL ISE:                                 {(100 /u {1,+,1}<%bb16>),+,(100 /u {1,+,1}<%bb16>)}<%bb25>
>>>>>>>>   NORMALIZED ISE:                       {((-1 * (100 /u {1,+,1}<%bb16>)) + (100 /u {0,+,1}<%bb16>)),+,(100 /u {0,+,1}<%bb16>)}<%bb25>
>>>>>>>>   DENORMALIZED BACK ISE:        {((2 * (100 /u {1,+,1}<%bb16>)) + (-1 * (100 /u {2,+,1}<%bb16>))),+,(100 /u {1,+,1}<%bb16>)}<%bb25>
>>>>>>>>
>>>>>>>> This could give an idea of what opportunities we could lose due to this change. Maybe we need to act so conservatively only in danger of overflow, i.e. when the original expression has sign-extensions/wrap-flags?
>>>>>>>
>>>>>>> Your current fix is conservatively correct. However, it also caught a more pervasive bug (the original bug only had to do with overflow at the induction variable’s boundary). As a follow-up, I agree with your suggestion to fix normalization so that it recursively normalizes the stride before subtracting it from the initial value. Your normalize-denormalize round trip check should also be sufficient to test that fix (you should only be eliminating cases in which the check fails, not adding any new ones).
>>>>>>>
>>>>>>> -Andy
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Michael
>>>>>>>>
>>>>>>>> On Mar 10, 2014, at 5:01 PM, Andrew Trick <atrick at apple.com> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mar 10, 2014, at 5:00 PM, Eric Christopher <echristo at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>>> As we discussed, once you have finished running the test-suite with a check
>>>>>>>>>>> to find out if any important cases are no longer normalized as expected,
>>>>>>>>>>> then I think you can commit. I know you've already verified performance, so
>>>>>>>>>>> this is just an extra sanity check.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Any interesting results?
>>>>>>>>>
>>>>>>>>> No performance changes, just fixing PR17473.
>>>>>>>>>
>>>>>>>>> -Andy
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> -eric
>>>>>>>>>>
>>>>>>>>>>> Thanks.
>>>>>>>>>>>
>>>>>>>>>>> -Andy
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Michael
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Arnold
>>>>>>>>>>>
>>>>>>>>>>> On Mar 10, 2014, at 2:56 PM, Michael Zolotukhin <mzolotukhin at apple.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> This is a fix for PR17473. The issue is that LSR performs normalization of
>>>>>>>>>>> detected IV users and sometimes wants to denormalize it back hoping to get
>>>>>>>>>>> the original expression. But normalization isn't always invertible, and the
>>>>>>>>>>> new expression might be not equivalent to the original one. In the test
>>>>>>>>>>> case, we were losing sign-extension in such transformation. The patch simply
>>>>>>>>>>> adds a check, if the transformation is invertible.
>>>>>>>>>>>
>>>>>>>>>>> <pr17473.patch>
>>>>>>>>>>>
>>>>>>>>>>> Ok for trunk?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Michael
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> llvm-commits mailing list
>>>>>>>>>>> llvm-commits at cs.uiuc.edu
>>>>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
>