[llvm-dev] Wrong code bug after GVN/PRE?

Mon Jan 16 02:46:59 PST 2017

Hi,

On 01/13/2017 10:29 PM, Daniel Berlin wrote:
> Yeah, there's a lot of things this could be.
>
> On the memdep side:
> Note that memdep is not actually properly updated in all cases by most
> passes that claim to not invalidate it (they don't invalidate dependent
> pointers, only pointers they directly touch).
>
> There's already a bug filed about this.

You mean
  https://llvm.org/bugs/show_bug.cgi?id=27740
?

> So far we've only seen
> missed-opt, not wrong code from this.
> But it should be possible to generate wrong code bugs with it (if you
> place the load/store in a place that invalidates the cached dependent
> result) , so i wonder if this is one.

Any idea what to do about it?

I can disable load-PRE in gvn to make this instance of the problem go 
away (albeit at a performance cost), but if there is a general problem 
with memdep I suppose it's not only load PRE where things may go wrong 
due to it?

Thanks,
Mikael

>
> Otherwise, the other reason it could give different answers after
> running a bunch of passes is the random scanning limits it has.
>
> On the PRE side:
> PHITranslateAddr goes through a bunch of crazy machinations to try to be
> able to prove things about inter-iteration variables instead of just
> doing phi translation.
>
> LoopLoadElimination was written so that these could be removed, because
> they are pretty much crazytown.
>
>
>
> --Dan
>
>
> On Fri, Jan 13, 2017 at 10:31 AM, Friedman, Eli via llvm-dev
> <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>
>     On 1/13/2017 12:31 AM, Mikael Holmén via llvm-dev wrote:
>
>         Hi,
>
>         I've stumbled upon a case where I think gvn does a bad (wrong)
>         optimization. It's a bit messy to debug though so I'm not sure
>         if I should just write a PR about it a let someone who knows the
>         code look at it instead.
>
>         Anyway, for the bug to trigger I need to run the following
>         passes in the same opt invocation:
>          -sroa -instcombine -simplifycfg -instcombine -gvn
>
>         The problem seems to be that GVN::PerformLoadPRE triggers and I
>         see a
>
>         GVN REMOVING PRE LOAD:   %_tmp79 = load i16, i16* %_tmp78, align 2
>
>         printout.
>
>         If I instead first run
>
>          -sroa -instcombine -simplifycfg -instcombine
>
>         and then
>
>          -gvn
>
>         I don't get the
>
>         GVN REMOVING PRE LOAD
>
>         printout, and the resulting code looks ok to me.
>
>         Is this expected? Should the output differ in these two cases?
>
>
>         The input to gvn looks the same when running all passes and just
>         gvn, but I see a difference in how
>         GVN::processNonLocalLoad(LoadInst *LI) behaves:
>
>         I get different results from
>
>           // Step 1: Find the non-local dependencies of the load.
>           LoadDepVect Deps;
>           MD->getNonLocalPointerDependency(LI, Deps);
>
>         So we get different results from MemoryDependenceResults when
>         invoking gvn alone or after a bunch of other passes.
>
>         Is this expected or not?
>
>
>         I tried to dig into
>         MemoryDependenceResults::getNonLocalPointerDependency but I have
>         to say I'm quite lost.
>
>         I ran some git bisect and as far as I can tell it starts going
>         wrong with commit
>
>          [SimplifyCFG] Change the algorithm in SinkThenElseCodeToEnd
>
>         2016-09-01, but that commit doesn't change
>         GVN/MemoryDependenceResults so I suppose it just changes the
>         input to GVN so the problem suddenly surfaces.
>
>         Finally, the problem that I see is:
>
>         In the input we have something like
>
>             for (int step1 = 0; step1 < LOOP_AMOUNT; step1++)
>             {
>                 lb[step1] = step1 + 7;
>                 ub[step1] = (step1 == 0 ? 10: ub[step1-1]*10);
>
>                 switch(lb[step1]) {
>                 case 7:    CVAL_VERIFY(step1 == 0);
>                     CVAL_VERIFY(ub[step1] == 10);
>                                 __label(511);
>                     break;
>                 case 8:    CVAL_VERIFY(step1 == 1);
>                     CVAL_VERIFY(ub[step1] == 100);
>                                 __label(512);
>                     break;
>                 case 9:    CVAL_VERIFY(step1 == 2);
>                     CVAL_VERIFY(ub[step1] == 1000);
>                                 __label(513);
>                     break;
>                 default:;
>                 }
>
>                         [...]
>             }
>
>             int i, j;
>             for ( j = 10, i = 0; j < 101; j=j*10, i++ )
>             {
>                 CVAL_VERIFY(ub[i] == j);
>             }
>
>         So we have two loops. In the first one the array ub is written
>         (low index first, high index last), and then in the second loop
>         the content of ub is checked (low index first, high index last).
>
>         After gvn this is changed into something like
>
>                 int tmp;
>             for (int step1 = 0; step1 < LOOP_AMOUNT; step1++)
>             {
>                 lb[step1] = step1 + 7;
>                 ub[step1] = (step1 == 0 ? 10: ub[step1-1]*10);
>
>                 switch(lb[step1]) {
>                 case 7:    CVAL_VERIFY(step1 == 0);
>                                 tmp = ub[step1];
>                     CVAL_VERIFY(tmp == 10);
>                                 __label(511);
>                     break;
>                 case 8:    CVAL_VERIFY(step1 == 1);
>                                 tmp = ub[step1];
>                     CVAL_VERIFY(tmp == 100);
>                                 __label(512);
>                     break;
>                 case 9:    CVAL_VERIFY(step1 == 2);
>                                 tmp = ub[step1];
>                     CVAL_VERIFY(tmp == 1000);
>                                 __label(513);
>                     break;
>                 default:;
>                 }
>
>                         [...]
>             }
>
>             int i, j;
>             for ( j = 10, i = 0; j < 101; j=j*10, i++ )
>             {
>                 CVAL_VERIFY((i == 0 ? tmp : ub[i]) == j);
>             }
>
>         Note the introduced variable tmp.
>
>         Also note that with this rewrite, after the first loop tmp will
>         contain the value of the element at the highest index in ub, and
>         then we use tmp in the first round of the second loop, but there
>         we expect the value of the lowest index element in ub.
>
>         Any ideas? Should I file a PR?
>
>
>     Ooh, wow, that's nasty: it looks like PRE is somehow deciding that
>     ub[step1] and ub[0] are equivalent.  Please file a bug.
>
>     The reason you're having trouble splitting the "opt" invocation is
>     that you aren't passing in "-preserve-ll-uselistorder"; it usually
>     doesn't matter, but occasionally you'll run into an optimization
>     which is sensitive to it.
>
>     -Eli
>
>     --
>     Employee of Qualcomm Innovation Center, Inc.
>     Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a
>     Linux Foundation Collaborative Project
>
>     _______________________________________________
>     LLVM Developers mailing list
>     llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>     http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>     <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>
>