[PATCH] D69654: AMDGPU: Switch backend default max workgroup size to 1024
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 31 10:57:49 PDT 2019
rampitec added a comment.
I do not think that deliberately introducing a performance regression is a good way to force the FE to do anything.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp:354
default:
- return std::make_pair(1, 16 * getWavefrontSize());
+ return std::make_pair(1u, getMaxFlatWorkGroupSize());
}
----------------
As far as I can see, getMaxFlatWorkGroupSize() currently returns 2048, not 1024.
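To make the arithmetic behind this comment concrete, here is a minimal standalone sketch. It is not the LLVM code: oldDefaultRange and newDefaultRange are hypothetical helpers that only mirror the two return expressions in the hunk above, and the 2048 passed in the usage note is simply the value reported in this comment, taken as given.

```cpp
#include <cassert>
#include <utility>

// AMDGPU wavefront sizes (wave32 and wave64).
constexpr unsigned kWave32 = 32;
constexpr unsigned kWave64 = 64;

// Hypothetical helper mirroring the removed line:
//   return std::make_pair(1, 16 * getWavefrontSize());
constexpr std::pair<unsigned, unsigned> oldDefaultRange(unsigned WavefrontSize) {
  return {1u, 16u * WavefrontSize};
}

// Hypothetical helper mirroring the added line:
//   return std::make_pair(1u, getMaxFlatWorkGroupSize());
// The subtarget maximum is passed in as a parameter because this sketch
// does not model the subtarget itself.
constexpr std::pair<unsigned, unsigned> newDefaultRange(unsigned MaxFlatWorkGroupSize) {
  return {1u, MaxFlatWorkGroupSize};
}

// The old default upper bound depends on the wavefront size.
static_assert(oldDefaultRange(kWave64).second == 1024, "wave64: 16 * 64");
static_assert(oldDefaultRange(kWave32).second == 512, "wave32: 16 * 32");
```

With these helpers, newDefaultRange(2048).second is 2048, so if getMaxFlatWorkGroupSize() really returns 2048, the new default upper bound would be 2048 rather than the 1024 in the patch title.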
================
Comment at: llvm/test/CodeGen/AMDGPU/hsa-metadata-kernel-code-props-v3.ll:15
; CHECK: .kernarg_segment_size: 24
-; CHECK: .max_flat_workgroup_size: 256
+; CHECK: .max_flat_workgroup_size: 1024
; CHECK: .name: test
----------------
And given that getMaxFlatWorkGroupSize() returns 2048, I do not understand how this works.
================
Comment at: llvm/test/CodeGen/AMDGPU/occupancy-levels.ll:265
; GCN-LABEL: {{^}}used_lds_6556:
-; GFX9: ; Occupancy: 9
-; GFX1010W64: ; Occupancy: 19
+; GFX9: ; Occupancy: 10
+; GFX1010W64: ; Occupancy: 20
----------------
I believe this needs to be investigated first. There must be some wrong logic somewhere in the LDS accounting for occupancy.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D69654/new/
https://reviews.llvm.org/D69654