[PATCH] D76861: [AMDGPU] Fix getEUsPerCU for gfx10 in CU mode

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 23 06:00:46 PDT 2022


foad added inline comments.


================
Comment at: llvm/test/CodeGen/AMDGPU/attr-amdgpu-flat-work-group-size-vgpr-limit.ll:3
+; -regalloc=fast just makes the test run faster
+; RUN: llc -march=amdgcn -mcpu=gfx900 -amdgpu-function-calls=false -enable-misched=false -regalloc=fast < %s | FileCheck %s --check-prefixes=GCN,GFX9
+; RUN: llc -march=amdgcn -mcpu=gfx1010 -amdgpu-function-calls=false -enable-misched=false -regalloc=fast < %s | FileCheck %s --check-prefixes=GCN,GFX10WGP-WAVE32
----------------
LuoYuanke wrote:
> Specifying `-regalloc=fast` is not reliable. With fast register allocation, `LIS = getAnalysisIfAvailable<LiveIntervals>();` yields nullptr in the "si-lower-sgpr-spills" pass, so the pass does not create slot indexes for newly inserted instructions. The machine instruction verifier then fails when it checks slot indexes. This can be reproduced with the test case below. Is it possible to use greedy-ra and reduce the compile time for this test case?
> 
> 
> ```
> define internal void @use256vgprs() {
>   %v0 = call i32 asm sideeffect "; def $0", "=v"()
>   %v1 = call i32 asm sideeffect "; def $0", "=v"()
>   call void asm sideeffect "; use $0", "v"(i32 %v0)
>   call void asm sideeffect "; use $0", "v"(i32 %v1)
>   ret void
> }
> 
> define amdgpu_kernel void @f256() #256 {
>   call void @use256vgprs()
>   ret void
> }
> attributes #256 = { nounwind "amdgpu-flat-work-group-size"="256,256" }
> 
> define amdgpu_kernel void @f512() #512 {
>   call void @foo()
>   call void @use256vgprs()
>   ret void
> }
> attributes #512 = { nounwind "amdgpu-flat-work-group-size"="512,512" }
> 
> define amdgpu_kernel void @f1024() #1024 {
>   call void @foo()
>   call void @use256vgprs()
>   ret void
> }
> 
> attributes #1024 = { nounwind "amdgpu-flat-work-group-size"="1024,1024" }
> 
> declare void @foo()
> 
> ```
That sounds like a bug in SILowerSGPRSpills. It should not claim to preserve SlotIndexes if it does not preserve them.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76861/new/

https://reviews.llvm.org/D76861


