[llvm-dev] Unreachable CFG subcomponent produced by jump threading

Sun Aug 15 02:37:29 PDT 2021

Hi, thanks for the reply. Here is the actual IR. Sorry I cannot create a reduced example, because the IR needs to have a connected CFG and then one component is made unreachable. This is hard to reduce. I tried but the reduced IR no longer have this characteristics. 

The actual problem is in loop deletion: 
Use still stuck around after Def is destroyed:  %12 = phi i16 [ undef, %10 ], [ <badref>, %21 ]

Reproduce by:
Opt -S -jump-threading -licm -loop-deletion xxx.ll

%12 = phi i16 [ undef, %10 ], [ <badref>, %21 ] was  %12 = phi i16 [ undef, %10 ], [ %21, %32 ] before jump threading. As described in the original message bb32 was made unreachable in jumpthreading, but this phinode remains until loop deletion. Then because this instruction is in a loop preheader, we get the failure when the loop body is deleted. (This phinode seems valid LCSSA form because LCSSA check skips unreachable components.) 

--------------------Start of actual IR----------------------
@c = internal unnamed_addr global i1 false, align 4
@b = internal unnamed_addr global i1 false, align 4
@h = internal unnamed_addr global i16* null, align 8
@f = internal unnamed_addr global i1 false, align 8
@d = internal unnamed_addr global i1 false, align 4

declare void @llvm.lifetime.start.p0i8(i64 immarg %0, i8* nocapture %1)

define internal fastcc void @j() unnamed_addr {
  %1 = alloca i16, align 4
  %2 = load i1, i1* @c, align 4
  br i1 %2, label %5, label %3

3:                                                ; preds = %0
  store i1 true, i1* @b, align 4
  store i1 true, i1* @c, align 4
  %4 = bitcast i16* %1 to i8*
  call void @llvm.lifetime.start.p0i8(i64 2, i8* nonnull %4)
  br label %10

5:                                                ; preds = %0
  %6 = load i1, i1* @b, align 4
  %7 = bitcast i16* %1 to i8*
  call void @llvm.lifetime.start.p0i8(i64 2, i8* nonnull %7)
  br i1 %6, label %10, label %8

8:                                                ; preds = %5
  store i16* %1, i16** @h, align 8
  br label %9

9:                                                ; preds = %9, %8
  br label %9

10:                                               ; preds = %5, %3
  store i1 true, i1* @d, align 4
  br label %11

11:                                               ; preds = %32, %10
  %12 = phi i16 [ undef, %10 ], [ %21, %32 ]
  %13 = load i1, i1* @f, align 8
  %14 = zext i1 %13 to i16
  br label %17

15:                                               ; preds = %22
  %16 = icmp slt i16 %23, 77
  br i1 %16, label %25, label %27

17:                                               ; preds = %25, %11
  %18 = phi i16 [ 0, %11 ], [ %26, %25 ]
  %19 = phi i16 [ %12, %11 ], [ %21, %25 ]
  br label %20

; CHECK: or
20:                                               ; preds = %17
  %21 = or i16 %19, %14
  store i16 %21, i16* %1, align 4
  br label %22

22:                                               ; preds = %20
  %23 = add i16 %18, 1
  %24 = icmp slt i16 %23, 7

25:                                               ; preds = %22, %15
  %26 = phi i16 [ %23, %22 ], [ 0, %15 ]
  br label %17, !llvm.loop !17

; This bb will be deleted because in bb15 the jump to here is removed.
27:                                               ; preds = %15
  br label %28

; This bb will become unreachable because bb27 is deleted, but it
; still has predecessor %32 so it won't be trivally deleted.
; CHECK-NOT: load i1, i1* @d
28:                                               ; preds = %27, %32
  %29 = phi i16 [ %33, %32 ], [ %23, %27 ]
  %30 = load i1, i1* @d, align 4
  br i1 %30, label %32, label %31

; This bb will become unreachable because bb27 is deleted.
; CHECK-NOT: store i1 true, i1* @f
31:                                               ; preds = %28
  store i1 true, i1* @f, align 8
  br label %32

; This bb will become unreachable because bb27 is deleted.
; CHECK-NOT: store i1 false, i1* @d
32:                                               ; preds = %31, %28
  %33 = and i16 %29, 11
  store i1 false, i1* @d, align 4
  %34 = icmp eq i16 %33, 0
  br i1 %34, label %11, label %28, !llvm.loop !19
}

!17 = distinct !{!17, !18}
!18 = !{!"llvm.loop.mustprogress"}
!19 = distinct !{!19, !18}
------------------------end of actual IR------------------------------

Any suggestion is appreciated.

Thanks,
Duanyang

-----邮件原件-----
发件人: Roman Lebedev [mailto:lebedev.ri at gmail.com] 
发送时间: 2021年8月14日 22:22
收件人: jingduanyang <jingduanyang at huawei.com>
抄送: llvm-dev at lists.llvm.org; Weiwei (weiwei, Compiler) <weiwei64 at huawei.com>; guopeilin <guopeilin1 at huawei.com>; Ehsan Amiri <ehsan.amiri at huawei.com>; Zhangwen(Esan) <zwzhangwen.zhang at huawei.com>
主题: Re: [llvm-dev] Unreachable CFG subcomponent produced by jump threading

Can you post an actual IR that can be run through -jump-threading that shows the problem?
What is the actual problem? What problems in later passes does it cause?|

Roman

On Sat, Aug 14, 2021 at 5:18 PM jingduanyang via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Hi,
>
>
>
> I have a program where during compilation an unreachable subcomponent of the CFG was created after an edge is removed by jump threading. Looks like this situation is not handled currently in jump threading. It would be great if anyone can give some advice.
>
>
>
> The snippet of the IR:
>
>
>
> 11:                                               ; preds = %32, %10
>
>   %12 = phi i16 [ undef, %10 ], [ %21, %32 ]   ; after jump threading, [%21, %32] will remain but %32 is already made unreachable
>
>   %13 = load i1, i1* @f, align 8
>
>   %14 = zext i1 %13 to i16
>
>   br label %17
>
>
>
> 15:                                               ; preds = %22
>
>   %16 = icmp slt i16 %23, 77
>
>   br i1 %16, label %25, label %27
>
>
>
> 17:                                               ; preds = %25, %11
>
>   %18 = phi i16 [ 0, %11 ], [ %26, %25 ]
>
>   %19 = phi i16 [ %12, %11 ], [ %21, %25 ]
>
>   br label %20
>
>
>
> ; CHECK: or
>
> 20:                                               ; preds = %17
>
>   %21 = or i16 %19, %14
>
>   store i16 %21, i16* %1, align 4, !tbaa !15
>
>   br label %22
>
>
>
> 22:                                               ; preds = %20
>
>   %23 = add i16 %18, 1
>
>   %24 = icmp slt i16 %23, 7
>
>   br i1 %24, label %25, label %15
>
>
>
> 25:                                               ; preds = %22, %15
>
>   %26 = phi i16 [ %23, %22 ], [ 0, %15 ]
>
>   br label %17, !llvm.loop !17
>
>
>
> ; This bb will be deleted because in bb15 the jump to here is removed.
>
> 27:                                               ; preds = %15
>
>   br label %28
>
>
>
> ; This bb will become unreachable because bb27 is deleted, but it
>
> ; still has predecessor %32 so it won't be trivally deleted.
>
> ; CHECK-NOT: load i1, i1* @d
>
> 28:                                               ; preds = %27, %32
>
>   %29 = phi i16 [ %33, %32 ], [ %23, %27 ]
>
>   %30 = load i1, i1* @d, align 4
>
>   br i1 %30, label %32, label %31
>
>
>
> ; This bb will become unreachable because bb27 is deleted.
>
> ; CHECK-NOT: store i1 true, i1* @f
>
> 31:                                               ; preds = %28
>
>   store i1 true, i1* @f, align 8
>
>   br label %32
>
>
>
> ; This bb will become unreachable because bb27 is deleted.
>
> ; CHECK-NOT: store i1 false, i1* @d
>
> 32:                                               ; preds = %31, %28
>
>   %33 = and i16 %29, 11
>
>   store i1 false, i1* @d, align 4
>
>   %34 = icmp eq i16 %33, 0
>
>   br i1 %34, label %11, label %28, !llvm.loop !19
>
>
>
> As described in the IR comments, the branch to bb28 will be removed by jump threading, then bb28, bb31, and bb32 will become unreachable, but %12 uses bb32 as incoming block which causes problems in later passes.
>
>
>
> In the current jump threading implementation, if an incoming edge to a block is removed and there is no predecessor left, the block will be deleted. But this does not cover the situation above, because the unreachable blocks are connected to each other. This is probably a corner case, but Iet me ask: should we enhance jump threading to capture this scenario?
>
>
>
> I can think of two fixes:
>
> 1.       Do some clean up to eliminate unreachable blocks at the end of jump threading. This seems to violate some existing assumption. E.g. I saw this comment from existing code:  “  // JumpThreading must not processes blocks unreachable from entry. It's a waste of compute time and can potentially lead to hangs.”
>
> 2.       When each block is processed, identity the above scenario and deal with it. This probably requires update of DT for every iteration? Which could be very expensive.
>
>
>
> Any advice/suggestion is greatly appreciated!
>
>
>
> Thanks,
>
> Duanyang
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev