[PATCH] D96805: [AMDGPU][CostModel] Refine cost model for control-flow instructions.

Tue Feb 16 15:38:04 PST 2021

dfukalov added a comment.

It seems to me this threshold bump is partially compensated by the cbr cost increase in all cases of unrolling loops with ifs, where the cbr cost is multiplied by the trip count.
The threshold was bumped because of test/CodeGen/AMDGPU/unroll.ll, where the following test started to fail:

  ; CHECK-LABEL: @unroll_for_if
  ; CHECK: entry:
  ; CHECK-NEXT: getelementptr
  ; CHECK-NEXT: store
  ; CHECK-NEXT: getelementptr
  ; CHECK-NEXT: store
  ; CHECK-NOT: br
  define amdgpu_kernel void @unroll_for_if(i32 addrspace(5)* %a) {
  entry:
    br label %for.body
  for.body:                                         ; preds = %entry, %for.inc
    %i1 = phi i32 [ 0, %entry ], [ %inc, %for.inc ]
    %and = and i32 %i1, 1
    %tobool = icmp eq i32 %and, 0
    br i1 %tobool, label %for.inc, label %if.then
  if.then:                                          ; preds = %for.body
    %0 = sext i32 %i1 to i64
    %arrayidx = getelementptr inbounds i32, i32 addrspace(5)* %a, i64 %0
    store i32 0, i32 addrspace(5)* %arrayidx, align 4
    br label %for.inc
  for.inc:                                          ; preds = %for.body, %if.then
    %inc = add nuw nsw i32 %i1, 1
    %cmp = icmp ult i32 %inc, 48
    br i1 %cmp, label %for.body, label %for.end

  for.end:                                          ; preds = %for.cond
    ret void
  }

since the cbr code size cost increased (which required raising the threshold to 250) and phi became non-free (a further +50, to 300).
Perhaps for now we should set the cbr code size estimate to 3 rather than 4 (2 exec mask manipulations) and collect more statistics.
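
To illustrate why the per-branch estimate matters so much here, below is a rough sketch of how the cbr cost scales with the trip count. It is a toy model and not what the unroller actually computes: the helper name `cbr_contribution` is hypothetical, it assumes both conditional branches in the loop (on %tobool and on %cmp) are charged on every iteration, and it ignores any instructions the unroll analysis simplifies away.

```python
# Toy model only: the cbr cost is charged once per conditional branch
# per iteration, so it is multiplied by the trip count on full unroll.

TRIP_COUNT = 48        # from `icmp ult i32 %inc, 48` in @unroll_for_if
CBR_PER_ITERATION = 2  # assumption: branches on %tobool and %cmp both count


def cbr_contribution(cbr_cost, trip_count=TRIP_COUNT,
                     cbr_per_iteration=CBR_PER_ITERATION):
    """Total cost attributed to conditional branches across all iterations."""
    return cbr_cost * cbr_per_iteration * trip_count


with_cost_4 = cbr_contribution(4)  # estimate of 4 per cbr
with_cost_3 = cbr_contribution(3)  # suggested estimate of 3 per cbr

print(with_cost_4, with_cost_3, with_cost_4 - with_cost_3)
```

Under these assumptions, changing the estimate by one unit per cbr shifts the total by 2 * 48 = 96 units for this loop, which is why a small per-instruction change can require a noticeable threshold adjustment.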

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D96805