[all-commits] [llvm/llvm-project] 25c5da: AMDGPU Reduce reported maximum group size to 1024
Matt Arsenault via All-commits
all-commits at lists.llvm.org
Tue Nov 12 17:41:47 PST 2019
Branch: refs/heads/master
Home: https://github.com/llvm/llvm-project
Commit: 25c5da5a426168b38fb3e9baa918faa75e4a92b4
https://github.com/llvm/llvm-project/commit/25c5da5a426168b38fb3e9baa918faa75e4a92b4
Author: Matt Arsenault <Matthew.Arsenault at amd.com>
Date: 2019-11-13 (Wed, 13 Nov 2019)
Changed paths:
M llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
M llvm/test/CodeGen/AMDGPU/attr-amdgpu-flat-work-group-size-v3.ll
M llvm/test/CodeGen/AMDGPU/attr-amdgpu-flat-work-group-size.ll
M llvm/test/CodeGen/AMDGPU/large-work-group-promote-alloca.ll
Log Message:
-----------
AMDGPU Reduce reported maximum group size to 1024

While some targets allow encoding 2048, this was never tested or
supported.
Commit: 4b472139513ba460595804f8113497844b41fbcc
https://github.com/llvm/llvm-project/commit/4b472139513ba460595804f8113497844b41fbcc
Author: Matt Arsenault <Matthew.Arsenault at amd.com>
Date: 2019-11-13 (Wed, 13 Nov 2019)
Changed paths:
M llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
M llvm/test/CodeGen/AMDGPU/amdgpu.private-memory.ll
M llvm/test/CodeGen/AMDGPU/array-ptr-calc-i32.ll
M llvm/test/CodeGen/AMDGPU/hsa-metadata-kernel-code-props-v3.ll
M llvm/test/CodeGen/AMDGPU/hsa-metadata-kernel-code-props.ll
M llvm/test/CodeGen/AMDGPU/lower-range-metadata-intrinsic-call.ll
M llvm/test/CodeGen/AMDGPU/occupancy-levels.ll
M llvm/test/CodeGen/AMDGPU/private-memory-r600.ll
M llvm/test/CodeGen/AMDGPU/promote-alloca-addrspacecast.ll
M llvm/test/CodeGen/AMDGPU/promote-alloca-to-lds-icmp.ll
M llvm/test/CodeGen/AMDGPU/promote-alloca-to-lds-phi.ll
M llvm/test/CodeGen/AMDGPU/promote-alloca-to-lds-select.ll
Log Message:
-----------
AMDGPU: Switch backend default max workgroup size to 1024

Previously this would default to 256, not the maximum supported size
of 1024. Using a maximum lower than the hardware maximum requires
language runtimes to enforce this limit for correctness, which no
language has correctly done. Switch the default to the conservatively
correct maximum, and force frontends to opt in to the more optimal 256
default maximum.

I don't really understand why the changes in occupancy-levels.ll
increased the computed occupancy, which I expected to decrease. I'm
not sure if these tests should be forcing the old maximum.
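
The policy described in the two log messages can be illustrated with a small standalone sketch. It is not the code from AMDGPUSubtarget.cpp or AMDGPUBaseInfo.cpp: the names resolveRange, parseUnsigned, GroupSizeRange and kHardwareMaxGroupSize are hypothetical, and the fallback behaviour for bad requests is an assumption. It only models the idea that a kernel with no explicit request gets the conservative range 1 to 1024, while a frontend can opt in to a smaller maximum by passing a "min,max" string such as the value of the amdgpu-flat-work-group-size function attribute exercised by the tests above.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Hypothetical constant taken from the numbers in the log messages,
// not copied from the LLVM sources.
constexpr unsigned kHardwareMaxGroupSize = 1024;

using GroupSizeRange = std::pair<unsigned, unsigned>; // {min, max}

// Parse a short run of decimal digits; anything else is rejected.
std::optional<unsigned> parseUnsigned(const std::string &s) {
  if (s.empty() || s.size() > 6) // short inputs cannot overflow unsigned
    return std::nullopt;
  unsigned value = 0;
  for (char c : s) {
    if (c < '0' || c > '9')
      return std::nullopt;
    value = value * 10 + static_cast<unsigned>(c - '0');
  }
  return value;
}

// Resolve the work-group size range for one kernel.
// `requested` models an optional "min,max" string supplied by a frontend.
// Assumption: a missing, malformed, or out-of-range request falls back to
// the conservative default rather than being clamped.
GroupSizeRange resolveRange(const std::optional<std::string> &requested) {
  const GroupSizeRange fallback{1, kHardwareMaxGroupSize};
  if (!requested)
    return fallback;

  const auto comma = requested->find(',');
  if (comma == std::string::npos)
    return fallback;

  const auto lo = parseUnsigned(requested->substr(0, comma));
  const auto hi = parseUnsigned(requested->substr(comma + 1));
  if (!lo || !hi)
    return fallback;
  if (*lo == 0 || *lo > *hi || *hi > kHardwareMaxGroupSize)
    return fallback;

  const GroupSizeRange range{*lo, *hi};
  assert(range.first >= 1 && range.second <= kHardwareMaxGroupSize);
  return range;
}
```

Under these assumptions, resolveRange(std::nullopt) yields the range 1 to 1024, a frontend that knows its runtime enforces the limit can pass "1,256" to get the smaller maximum, and a request for 2048 is rejected in favour of the default.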
Compare: https://github.com/llvm/llvm-project/compare/793b42a454ac...4b472139513b