[llvm] 30fd35f - AMDGPU: Add some notes about amdgpu-flat-work-group-size
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Fri Jul 7 16:02:51 PDT 2023
Author: Matt Arsenault
Date: 2023-07-07T19:02:46-04:00
New Revision: 30fd35f59ceb4c00a550b82af767a5b9cf9e252d
URL: https://github.com/llvm/llvm-project/commit/30fd35f59ceb4c00a550b82af767a5b9cf9e252d
DIFF: https://github.com/llvm/llvm-project/commit/30fd35f59ceb4c00a550b82af767a5b9cf9e252d.diff
LOG: AMDGPU: Add some notes about amdgpu-flat-work-group-size
Added:
Modified:
llvm/docs/AMDGPUUsage.rst
Removed:
################################################################################
diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst
index 2fae09a1bced59..dbe6e69a3b3975 100644
--- a/llvm/docs/AMDGPUUsage.rst
+++ b/llvm/docs/AMDGPUUsage.rst
@@ -999,7 +999,12 @@ The AMDGPU backend supports the following LLVM IR attributes.
"amdgpu-flat-work-group-size"="min,max" Specify the minimum and maximum flat work group sizes that
will be specified when the kernel is dispatched. Generated
by the ``amdgpu_flat_work_group_size`` CLANG attribute [CLANG-ATTR]_.
- The implied default value is 1,1024.
+ The IR implied default value is 1,1024. Clang may emit this attribute
+ with more restrictive bounds depending on language defaults.
+ If the actual block or workgroup size exceeds the limit at any point during
+ the execution, the behavior is undefined. For example, even if there is
+ only one active thread but the thread local id exceeds the limit, the
+ behavior is undefined.
"amdgpu-implicitarg-num-bytes"="n" Number of kernel argument bytes to add to the kernel
argument block size for the implicit arguments. This
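
To illustrate the bounds described in the added text, here is a minimal Python sketch of a host-side sanity check. It is not part of LLVM or of any runtime: the helper names `parse_flat_work_group_size` and `dispatch_within_bounds` are hypothetical. It assumes the flat work-group size is the product of the work-group dimensions, and it falls back to the IR implied default of 1,1024 when no attribute value is given.

```python
from math import prod

# IR implied default when the attribute is absent (per the patch text).
IR_IMPLIED_DEFAULT = (1, 1024)


def parse_flat_work_group_size(attr_value=None):
    """Parse a "min,max" attribute string into an (int, int) tuple.

    Hypothetical helper: returns the IR implied default when attr_value
    is None.
    """
    if attr_value is None:
        return IR_IMPLIED_DEFAULT
    parts = attr_value.split(",")
    if len(parts) != 2:
        raise ValueError(f"expected 'min,max', got {attr_value!r}")
    lo, hi = int(parts[0]), int(parts[1])
    if lo > hi:
        raise ValueError(f"min {lo} exceeds max {hi}")
    return lo, hi


def dispatch_within_bounds(work_group_dims, attr_value=None):
    """Return True if the flat work-group size lies within [min, max].

    The flat size is taken to be the product of the dimensions. The check
    concerns the dispatched size, not how many threads are active: a
    dispatch outside the bounds is undefined behavior even if only a
    single thread does any work.
    """
    lo, hi = parse_flat_work_group_size(attr_value)
    flat_size = prod(work_group_dims)
    return lo <= flat_size <= hi
```

For instance, a 16x16x1 work group (flat size 256) fits `"1,256"`, while a 32x16x1 work group (flat size 512) does not, regardless of which threads in it are active.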