[PATCH] D104870: [SimplifyCFG] Tail-merging all blocks with `unreachable` terminator

Roman Lebedev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 30 00:19:06 PDT 2021


lebedev.ri marked an inline comment as done.
lebedev.ri added a comment.

@aeubanks thank you for taking a look!



================
Comment at: llvm/test/Transforms/SimplifyCFG/tail-merge-noreturn.ll:129-130
 ; CHECK:       a2:
 ; CHECK-NEXT:    call void @assert_fail_1_alt(i32 0)
 ; CHECK-NEXT:    unreachable
 ;
----------------
aeubanks wrote:
> lebedev.ri wrote:
> > rnk wrote:
> > > lebedev.ri wrote:
> > > > rnk wrote:
> > > > > lebedev.ri wrote:
> > > > > > rnk wrote:
> > > > > > > lebedev.ri wrote:
> > > > > > > > rnk wrote:
> > > > > > > > > I expected your code to fire on this test case. Can you explain why this example isn't getting tail merged?
> > > > > > > > > 
> > > > > > > > > Consider this example: https://gcc.godbolt.org/z/ox16a9P1z
> > > > > > > > > ```
> > > > > > > > > [[noreturn]] void abort1();
> > > > > > > > > [[noreturn]] void abort2();
> > > > > > > > > [[noreturn]] void abort3();
> > > > > > > > > bool cond();
> > > > > > > > > void doAsserts() {
> > > > > > > > >     if (cond()) abort1();
> > > > > > > > >     if (cond()) abort2();
> > > > > > > > >     if (cond()) abort3();
> > > > > > > > > }
> > > > > > > > > ```
> > > > > > > > > 
> > > > > > > > > I think it is more canonical to leave these unreachable terminators in place after the calls to noreturn functions, rather than merging the unreachables together.
> > > > > > > > > 
> > > > > > > > > I just want to make sure your transform isn't firing, creating BBs, and then a later part of simplifycfg rolls the unreachables back up into place after the calls.
> > > > > > > > > I expected your code to fire on this test case. Can you explain why this example isn't getting tail merged?
> > > > > > > > 
> > > > > > > > It fired, we didn't sink anything, and `SimplifyCFGOpt::simplifyUnreachable()` decided to undo it.
> > > > > > > Got it, and we want to avoid that because otherwise it will make the overall pass return true to indicate that it changed something, which will make the parent pass manager re-run more passes.
> > > > > > True.
> > > > > > 
> > > > > > As far as i'm aware that only results in potentially invalidating analysises,
> > > > > > i'm not aware of that triggering another optimization pass runs.
> > > > > > 
> > > > > > IIRC that part of `SimplifyCFGOpt::simplifyUnreachable()` is a pretty important canonicalization,
> > > > > > because e.g. instcombine can't modify cfg.
> > > > > Even if it only invalidates analyses, I think this is worth addressing before landing this. Ideally this code would directly call the heuristic that "sink from common predecessors" uses, but if that isn't available, I think you could approximate it by not merging unreachable terminators when the previous non-debug instruction is a noreturn call with distinct callees. We know that is unprofitable, and that accounts for most blocks ending in unreachable. It saves compile time from IR churn too.
> > > > > 
> > > > > @aeubanks, what are the consequences of passes indicating that they changed the IR when they actually didn't?
> > > > This is impossible to address.
> > > > 
> > > > > I think you could approximate it by not merging unreachable terminators when the previous non-debug instruction is a noreturn call with distinct callees.
> > > > 
> > > > I can not, because fixing lack of sinking in such cases is basically the very next step here.
> > > > I can not, because fixing lack of sinking in such cases is basically the very next step here.
> > > 
> > > This is the transform I'm talking about avoiding, and I don't think we plan to do this in the next step:
> > > ```
> > > bb1:
> > >   call void @abort1()
> > >   unreachable
> > > bb2:
> > >   call void @abort2()
> > >   unreachable
> > > ```
> > > ->
> > > ```
> > > bb1:
> > >   br label %common
> > > bb2:
> > >   br label %common
> > > common:
> > >   %callee = phi ... @abort1 ... @abort2
> > >   call void ... %callee()
> > >   unreachable
> > > ```
> > > Right? This would make direct calls indirect, which is less canonical. I doubt this is going to change soon.
> > > 
> > > > This is impossible to address.
> > > 
> > > I guess what you are saying is that this isn't possible to implement with the current code and data structures. We'd need to incorporate the structure of the instruction before unreachable into the map. This makes me think maybe it would be better to extend the SinkFromCommonPredecessors logic to consider blocks ending in unreachable. The current code is essentially restructuring the CFG in a way that is convenient for that function. I think transforms should avoid changing the IR before they know if transformation is really profitable, and it seems like the profitability heuristic is over there.
> > > Right? This would make direct calls indirect, which is less canonical. I doubt this is going to change soon.
> > 
> > Right. This isn't going to change.
> > However, consider
> > ```
> > bb1:
> >   call void @abort1()
> >   unreachable
> > bb2:
> >   call void @abort2()
> >   unreachable
> > bb3:
> >   call void @abort2()
> >   unreachable
> > ```
> > ->
> > ```
> > bb1:
> >   call void @abort1()
> >   unreachable
> > bb2:
> >   br label %bb2.bb3.common
> > bb2.bb3.common:
> >   call void @abort2()
> >   unreachable
> > ```
> > 
> > Also, consider:
> > ```
> > bb1:
> >   call void @abort1()
> >   unreachable
> > bb2:
> >   call void @abort1()
> >   br label %bb3
> > bb3:
> >   unreachable
> > ```
> > 
> > My point being, we can't realistically say that we will/won't succeed in sinking stuff.
> there's no correctness issue with saying that we modified IR if we didn't actually, it'll just invalidate analyses, causing more work when they are recomputed in later passes
> 
> might be worth putting this through http://llvm-compile-time-tracker.com/
Since you asked, sure: https://llvm-compile-time-tracker.com/compare.php?from=1f169a774cb865659cefe085e70a56a884e3711e&to=fc54bb9a8ef85bd76dd9e934b2546f4beadc5b5e&stat=instructions
I'm not sure what this tells us here.
Since the instruction stat correlates with the size changes, i guess we could say that it lead to more inlining,
and more IR to chew through. Which is pretty much the expected outcome.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104870/new/

https://reviews.llvm.org/D104870



More information about the llvm-commits mailing list