[PATCH] D104445: [SimplifyCFGPass] Tail-merging function-terminating blocks

Fri Jun 18 11:01:47 PDT 2021

lebedev.ri added a comment.

Thank you for taking a look.
Thinking about it, this change should be split up into NFC refactor,
removal of "block must be empty" check,
and several patches to enable each terminator opcode.

In D104445#2827485 <https://reviews.llvm.org/D104445#2827485>, @rnk wrote:

> I stopped pursuing D29428 <https://reviews.llvm.org/D29428> for a few reasons:
>
> - it makes IR smaller, which causes more inlining, which regresses code size, the opposite of my goal =P

I have mentioned that in the description - as we've now established, that is expected and is an improvement.

> - it removes debug info

This change certainly doesn't remove debuginfo, and i'm not sure if the generic sinking code is debuginfo-lossy, i

> - it makes it harder to form regions later in the backend
>
> Merging blocks that end in unreachable can make things like stack variable lifetime analysis harder. This example surprised me and didn't illustrate my point, but it's still worth thinking about. Consider:
>
>   $ cat t.cpp 
>   struct LargeObject {
>     int a[1024];
>     void escapeObjectA();
>     void escapeObjectB();
>   };
>   bool cond();
>   void __attribute__((noreturn)) assertFail();
>   void foo() {
>     if (cond()) {
>       LargeObject o1;
>       o1.escapeObjectA();
>       if (cond())
>         assertFail();
>     } else {
>       LargeObject o2;
>       o2.escapeObjectB();
>       if (cond())
>         assertFail();
>     }
>   }
>
> If you merge together the calls to `assertFail`, that creates BBs where `o1` and `o2` are both live. This could confuse the stack coloring pass. Somehow, however, it seems to make no difference. I modified the IR to merge the blocks, and we still get the same stack size:
>
>   $ clang -S  -O2 t.cpp  -o - |& grep sub.*rsp
>           subq    $4104, %rsp                     # imm = 0x1008
>   
>   $ clang -S -emit-llvm -O2 t.cpp  -o t.ll
>   
>   # Manually merge noreturn blocks in t.ll
>   
>   $ llc t.ll  -o - | grep sub.*rsp
>           subq    $4104, %rsp                     # imm = 0x1008
>
> So, maybe it is fine. We merge the MBBs at the MI layer anyway, but that could happen after assigning stack offsets.
>
> GPU targets, wasm, and Windows EH preparation may need to undo this type of region-breaking transform late in the backend, and if they do, the only thing this transform is doing is throwing away debug info.

Yeah, no idea what any of this means, so i can't really comment on this point.

> If you limit this change to return blocks, that should completely address most regional concerns.

By "return blocks", you literally mean blocks that end with `ret`, but neither `resume` nor `unreachable`, correct?

> I would also like to see this change for noreturn / unreachable blocks, but I think it's a longer path from here to there.

Could you please be more specific about that path? I think i'm most interested in `resume`, actually.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104445/new/

https://reviews.llvm.org/D104445