[PATCH] D96805: [AMDGPU][CostModel] Refine cost model for control-flow instructions.

Daniil Fukalov via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 16 15:38:04 PST 2021


dfukalov added a comment.

It seems to me that this threshold bump is partially compensated by the cbr cost increase in all cases of unrolling loops with ifs, where the cost is multiplied by the trip count.
The threshold was bumped because of test/CodeGen/AMDGPU/unroll.ll, where this test started to fail

  ; CHECK-LABEL: @unroll_for_if
  ; CHECK: entry:
  ; CHECK-NEXT: getelementptr
  ; CHECK-NEXT: store
  ; CHECK-NEXT: getelementptr
  ; CHECK-NEXT: store
  ; CHECK-NOT: br
  define amdgpu_kernel void @unroll_for_if(i32 addrspace(5)* %a) {
  entry:
    br label %for.body
  for.body:                                         ; preds = %entry, %for.inc
    %i1 = phi i32 [ 0, %entry ], [ %inc, %for.inc ]
    %and = and i32 %i1, 1
    %tobool = icmp eq i32 %and, 0
    br i1 %tobool, label %for.inc, label %if.then
  if.then:                                          ; preds = %for.body
    %0 = sext i32 %i1 to i64
    %arrayidx = getelementptr inbounds i32, i32 addrspace(5)* %a, i64 %0
    store i32 0, i32 addrspace(5)* %arrayidx, align 4
    br label %for.inc
  for.inc:                                          ; preds = %for.body, %if.then
    %inc = add nuw nsw i32 %i1, 1
    %cmp = icmp ult i32 %inc, 48
    br i1 %cmp, label %for.body, label %for.end
  
  for.end:                                          ; preds = %for.inc
    ret void
  }

since the cbr code size cost increased (requiring an increase to 250) and phi became non-free (a further +50, to 300).
Perhaps for now we should set the cbr code size estimate to 3 (2 exec mask manipulations) rather than 4 and collect more statistics.
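
As a rough illustration of why a one-unit change in the cbr cost matters for this test, the sketch below multiplies a per-instruction cost change by the trip count. It is a simplified model under stated assumptions, not the unroller's actual logic: the helper name unrolled_cost_delta is hypothetical, and the model assumes every loop-body instruction is counted once per unrolled iteration. Only the trip count (48), the two conditional branches per iteration, and the cbr costs 4 and 3 come from the comment and the test.

```python
# Simplified model, not LLVM's actual unroller logic: assume each loop-body
# instruction contributes its code-size cost once per unrolled iteration.

def unrolled_cost_delta(num_insts, old_cost, new_cost, trip_count):
    """Change in estimated unrolled size when one opcode's cost changes."""
    return num_insts * (new_cost - old_cost) * trip_count

# From the test: two conditional branches per iteration
# (one in for.body, one in for.inc) and a trip count of 48.
NUM_CBR = 2
TRIP_COUNT = 48

# Lowering the cbr code-size cost from 4 to 3, as suggested in the comment.
delta = unrolled_cost_delta(NUM_CBR, old_cost=4, new_cost=3, trip_count=TRIP_COUNT)
print(delta)  # -96
```

The real unroll cost analysis can simplify instructions whose operands become constant after unrolling, so the absolute thresholds quoted above (250 and 300) cannot be derived from this model; it only shows how a per-branch cost change scales with the trip count.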


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96805/new/

https://reviews.llvm.org/D96805


