[llvm-dev] Wrong code bug after GVN/PRE?
Daniel Berlin via llvm-dev
llvm-dev at lists.llvm.org
Fri Jan 13 13:29:15 PST 2017
Yeah, there's a lot of things this could be.
On the memdep side:
Note that memdep is not actually properly updated in all cases by most
passes that claim to not invalidate it (they don't invalidate dependent
pointers, only pointers they directly touch).
There's already a bug filed about this. So far we've only seen missed-opt,
not wrong code from this.
But it should be possible to generate wrong code bugs with it (if you place
the load/store in a place that invalidates the cached dependent result) ,
so i wonder if this is one.
Otherwise, the other reason it could give different answers after running a
bunch of passes is the random scanning limits it has.
On the PRE side:
PHITranslateAddr goes through a bunch of crazy machinations to try to be
able to prove things about inter-iteration variables instead of just doing
phi translation.
LoopLoadElimination was written so that these could be removed, because
they are pretty much crazytown.
--Dan
On Fri, Jan 13, 2017 at 10:31 AM, Friedman, Eli via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> On 1/13/2017 12:31 AM, Mikael Holmén via llvm-dev wrote:
>
>> Hi,
>>
>> I've stumbled upon a case where I think gvn does a bad (wrong)
>> optimization. It's a bit messy to debug though so I'm not sure if I should
>> just write a PR about it a let someone who knows the code look at it
>> instead.
>>
>> Anyway, for the bug to trigger I need to run the following passes in the
>> same opt invocation:
>> -sroa -instcombine -simplifycfg -instcombine -gvn
>>
>> The problem seems to be that GVN::PerformLoadPRE triggers and I see a
>>
>> GVN REMOVING PRE LOAD: %_tmp79 = load i16, i16* %_tmp78, align 2
>>
>> printout.
>>
>> If I instead first run
>>
>> -sroa -instcombine -simplifycfg -instcombine
>>
>> and then
>>
>> -gvn
>>
>> I don't get the
>>
>> GVN REMOVING PRE LOAD
>>
>> printout, and the resulting code looks ok to me.
>>
>> Is this expected? Should the output differ in these two cases?
>>
>>
>> The input to gvn looks the same when running all passes and just gvn, but
>> I see a difference in how GVN::processNonLocalLoad(LoadInst *LI) behaves:
>>
>> I get different results from
>>
>> // Step 1: Find the non-local dependencies of the load.
>> LoadDepVect Deps;
>> MD->getNonLocalPointerDependency(LI, Deps);
>>
>> So we get different results from MemoryDependenceResults when invoking
>> gvn alone or after a bunch of other passes.
>>
>> Is this expected or not?
>>
>>
>> I tried to dig into MemoryDependenceResults::getNonLocalPointerDependency
>> but I have to say I'm quite lost.
>>
>> I ran some git bisect and as far as I can tell it starts going wrong with
>> commit
>>
>> [SimplifyCFG] Change the algorithm in SinkThenElseCodeToEnd
>>
>> 2016-09-01, but that commit doesn't change GVN/MemoryDependenceResults so
>> I suppose it just changes the input to GVN so the problem suddenly surfaces.
>>
>> Finally, the problem that I see is:
>>
>> In the input we have something like
>>
>> for (int step1 = 0; step1 < LOOP_AMOUNT; step1++)
>> {
>> lb[step1] = step1 + 7;
>> ub[step1] = (step1 == 0 ? 10: ub[step1-1]*10);
>>
>> switch(lb[step1]) {
>> case 7: CVAL_VERIFY(step1 == 0);
>> CVAL_VERIFY(ub[step1] == 10);
>> __label(511);
>> break;
>> case 8: CVAL_VERIFY(step1 == 1);
>> CVAL_VERIFY(ub[step1] == 100);
>> __label(512);
>> break;
>> case 9: CVAL_VERIFY(step1 == 2);
>> CVAL_VERIFY(ub[step1] == 1000);
>> __label(513);
>> break;
>> default:;
>> }
>>
>> [...]
>> }
>>
>> int i, j;
>> for ( j = 10, i = 0; j < 101; j=j*10, i++ )
>> {
>> CVAL_VERIFY(ub[i] == j);
>> }
>>
>> So we have two loops. In the first one the array ub is written (low index
>> first, high index last), and then in the second loop the content of ub is
>> checked (low index first, high index last).
>>
>> After gvn this is changed into something like
>>
>> int tmp;
>> for (int step1 = 0; step1 < LOOP_AMOUNT; step1++)
>> {
>> lb[step1] = step1 + 7;
>> ub[step1] = (step1 == 0 ? 10: ub[step1-1]*10);
>>
>> switch(lb[step1]) {
>> case 7: CVAL_VERIFY(step1 == 0);
>> tmp = ub[step1];
>> CVAL_VERIFY(tmp == 10);
>> __label(511);
>> break;
>> case 8: CVAL_VERIFY(step1 == 1);
>> tmp = ub[step1];
>> CVAL_VERIFY(tmp == 100);
>> __label(512);
>> break;
>> case 9: CVAL_VERIFY(step1 == 2);
>> tmp = ub[step1];
>> CVAL_VERIFY(tmp == 1000);
>> __label(513);
>> break;
>> default:;
>> }
>>
>> [...]
>> }
>>
>> int i, j;
>> for ( j = 10, i = 0; j < 101; j=j*10, i++ )
>> {
>> CVAL_VERIFY((i == 0 ? tmp : ub[i]) == j);
>> }
>>
>> Note the introduced variable tmp.
>>
>> Also note that with this rewrite, after the first loop tmp will contain
>> the value of the element at the highest index in ub, and then we use tmp in
>> the first round of the second loop, but there we expect the value of the
>> lowest index element in ub.
>>
>> Any ideas? Should I file a PR?
>>
>
> Ooh, wow, that's nasty: it looks like PRE is somehow deciding that
> ub[step1] and ub[0] are equivalent. Please file a bug.
>
> The reason you're having trouble splitting the "opt" invocation is that
> you aren't passing in "-preserve-ll-uselistorder"; it usually doesn't
> matter, but occasionally you'll run into an optimization which is sensitive
> to it.
>
> -Eli
>
> --
> Employee of Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux
> Foundation Collaborative Project
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170113/ca4313cf/attachment.html>
More information about the llvm-dev
mailing list