[lld] [flang] [llvm] [clang] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)
Tony Tye via cfe-commits
cfe-commits at lists.llvm.org
Wed Jan 17 20:21:39 PST 2024
================
@@ -520,6 +520,106 @@ Every processor supports every OS ABI (see :ref:`amdgpu-os`) with the following
=========== =============== ============ ===== ================= =============== =============== ======================
+Generic processors also exist. They group multiple processors into one,
+allowing to build code once and run it on multiple targets at the cost
+of less features being available.
+
+Generic processors are only available on Code Object V6 and up.
+
+ .. table:: AMDGPU Generic Processors
+ :name: amdgpu-generic-processor-table
+
+ ==================== ============== ================= =================================
+ Processor Target Supported Target
+ Triple Processors Features
+ Architecture Restrictions
+
+
+
+
+
+
+
+
+ ==================== ============== ================= =================================
+ ``gfx9-generic`` ``amdgcn`` - ``gfx900`` - ``v_mad_mix`` instructions
+ - ``gfx902`` are not available on
+ - ``gfx904`` ``gfx900``, ``gfx902``,
+ - ``gfx906`` ``gfx909``, ``gfx90c``
+ - ``gfx909`` - ``v_fma_mix`` instructions
+ - ``gfx90c`` are not available on ``gfx904``
+ - sramecc is not available on
+ ``gfx906``
+ - The following instructions
+ are not available on ``gfx906``:
+
+ - ``v_fmac_f32``
+ - ``v_xnor_b32``
+ - ``v_dot4_i32_i8``
+ - ``v_dot8_i32_i4``
+ - ``v_dot2_i32_i16``
+ - ``v_dot2_u32_u16``
+ - ``v_dot4_u32_u8``
+ - ``v_dot8_u32_u4``
+ - ``v_dot2_f32_f16``
+
+
+ ``gfx10.1-generic`` ``amdgcn`` - ``gfx1010`` - The following instructions are
+ - ``gfx1011`` not available on ``gfx1011``
+ - ``gfx1012`` and ``gfx1012``
+ - ``gfx1013``
+ - ``v_dot4_i32_i8``
+ - ``v_dot8_i32_i4``
+ - ``v_dot2_i32_i16``
+ - ``v_dot2_u32_u16``
+ - ``v_dot2c_f32_f16``
+ - ``v_dot4c_i32_i8``
+ - ``v_dot4_u32_u8``
+ - ``v_dot8_u32_u4``
+ - ``v_dot2_f32_f16``
+
+ - BVH Ray Tracing instructions
+ are not available on
+ ``gfx1013``
+
+
+ ``gfx10.3-generic`` ``amdgcn`` - ``gfx1030`` No restrictions.
+ - ``gfx1031``
+ - ``gfx1032``
+ - ``gfx1033``
+ - ``gfx1034``
+ - ``gfx1035``
+ - ``gfx1036``
+
+
+ ``gfx11-generic`` ``amdgcn`` - ``gfx1100`` Various codegen pessimizations
+ - ``gfx1101`` are applied to all targets to
+ - ``gfx1102`` work around hardware bugs on one
----------------
t-tye wrote:
I do not think we should be stating hardware bugs exist in public documentation. We can simply say less efficient code sequences are generated in various cases. Not sure we should list them.
Do we use msaa-load-dst-sel-bug, valu-trans-use-hazard, user-sgpr-init16-bug elsewhere in the code? Not sure we should be using the word "bug". Better to say "hazard".
https://github.com/llvm/llvm-project/pull/76955