[PATCH] D154695: [Coroutines] Add an O(n) algorithm for computing the cross suspend point information.

Wed Jul 19 02:17:52 PDT 2023

witstorm95 added a comment.

@ChuanqiXu I takes some time to figure out where the cross suspend point information is used and what it does . It used to determine whether the stack variable need to be placed on Coroutine frame. So the current problem does not affect the correctness of the coroutine programs.

If we stick to the definition about KIlls, we may get the wrong result in some cases. For example(llvm/test/Transforms/Coroutines/ArgAddr.ll),

  define nonnull ptr @f(i32 %n) presplitcoroutine {
  ; CHECK-LABEL: @f(
  ; CHECK-NEXT:  entry:
  ; CHECK-NEXT:    [[ID:%.*]] = call token @llvm.coro.id(i32 0, ptr null, ptr null, ptr @f.resumers)
  ; CHECK-NEXT:    [[N_ADDR:%.*]] = alloca i32, align 4
  ; CHECK-NEXT:    store i32 [[N:%.*]], ptr [[N_ADDR]], align 4
  ; CHECK-NEXT:    [[CALL:%.*]] = tail call ptr @malloc(i32 24)
  ; CHECK-NEXT:    [[TMP0:%.*]] = tail call noalias nonnull ptr @llvm.coro.begin(token [[ID]], ptr [[CALL]])
  ; CHECK-NEXT:    store ptr @f.resume, ptr [[TMP0]], align 8
  ; CHECK-NEXT:    [[DESTROY_ADDR:%.*]] = getelementptr inbounds [[F_FRAME:%.*]], ptr [[TMP0]], i32 0, i32 1
  ; CHECK-NEXT:    store ptr @f.destroy, ptr [[DESTROY_ADDR]], align 8
  ; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr inbounds [[F_FRAME]], ptr [[TMP0]], i32 0, i32 2
  ; CHECK-NEXT:    [[TMP2:%.*]] = load i32, ptr [[N_ADDR]], align 4
  ; CHECK-NEXT:    store i32 [[TMP2]], ptr [[TMP1]], align 4
  ;
  entry:
    %id = call token @llvm.coro.id(i32 0, ptr null, ptr null, ptr null);
    %n.addr = alloca i32
    store i32 %n, ptr %n.addr ; this needs to go after coro.begin
    %0 = tail call i32 @llvm.coro.size.i32()
    %call = tail call ptr @malloc(i32 %0)
    %1 = tail call noalias nonnull ptr @llvm.coro.begin(token %id, ptr %call)
    call void @ctor(ptr %n.addr)
    br label %for.cond

  for.cond:
    %2 = load i32, ptr %n.add
    %dec = add nsw i32 %2, -1
    store i32 %dec, ptr %n.addr
    call void @print(i32 %2)
    %3 = call i8 @llvm.coro.suspend(token none, i1 false)
    %conv = sext i8 %3 to i32
    switch i32 %conv, label %coro_Suspend [
    i32 0, label %for.cond
    i32 1, label %coro_Cleanup
    ]

  coro_Cleanup:
    %4 = call ptr @llvm.coro.free(token %id, ptr nonnull %1)
    call void @free(ptr %4)
    br label %coro_Suspend

  coro_Suspend:
    call i1 @llvm.coro.end(ptr null, i1 false)
    ret ptr %1
  }

Apparently, stack variable %n.addr should be placed on Coroutine frame as it be used in for.cond block that contains suspend point. Right ? But If we stick to the definition about KIlls, we can't get this result. Why? Because splitAround will split for.cond into 4 blocks(for.cond, CoroSave, CoroSuspend, AfterSuspend) before the cross suspend point information is computed. And splitAround will lead some info lost. Here is CFG about function f,

F28318563: 1.jpg <https://reviews.llvm.org/F28318563>

Code of each block as follow,

  entry:
    %id = call token @llvm.coro.id(i32 0, ptr null, ptr null, ptr null)
    %n.addr = alloca i32, align 4
    store i32 %n, ptr %n.addr, align 4
    %0 = tail call i32 @llvm.coro.size.i32()
    %call = tail call ptr @malloc(i32 %0)
    %1 = tail call noalias nonnull ptr @llvm.coro.begin(token %id, ptr %call)
    call void @ctor(ptr %n.addr)
    br label %for.cond

  for.cond:                                         ; preds = %AfterCoroSuspend, %entry
    %2 = load i32, ptr %n.addr, align 4
    %dec = add nsw i32 %2, -1
    store i32 %dec, ptr %n.addr, align 4
    call void @print(i32 %2)
    br label %CoroSave

  coro_Suspend:                                     ; preds = %coro_Cleanup, %AfterCoroSuspend
    br label %CoroEnd

  coro_Cleanup:                                     ; preds = %AfterCoroSuspend
    %5 = call ptr @llvm.coro.free(token %id, ptr nonnull %1)
    call void @free(ptr %5)
    br label %coro_Suspend

  CoroSave:                                         ; preds = %for.cond
    %3 = call token @llvm.coro.save(ptr %1)
    br label %CoroSuspend

  CoroSuspend:                                      ; preds = %CoroSave
    %4 = call i8 @llvm.coro.suspend(token %3, i1 false)
    br label %AfterCoroSuspend

  AfterCoroSuspend:                                 ; preds = %CoroSuspend
    %conv = sext i8 %4 to i32
    switch i32 %conv, label %coro_Suspend [
      i32 0, label %for.cond
      i32 1, label %coro_Cleanup
    ]

  CoroEnd:                                          ; preds = %coro_Suspend
    %6 = call i1 @llvm.coro.end(ptr null, i1 false)
    br label %AfterCoroEnd

  AfterCoroEnd:                                     ; preds = %CoroEnd
    ret ptr %1

Apparently, for.cond.Kills[Entry] is false, it means %n.addr not crossing a suspend point. So %n.addr will not be placed on Coroutine frame. But it should be placed on Coroutine frame actually.

If I don't understand it correctly, pelase help to point out.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154695/new/

https://reviews.llvm.org/D154695