[PATCH] D158468: [AMDGPU] Support sdot4 / sdot8 intrinsics on gfx11
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Aug 22 12:05:02 PDT 2023
arsenm added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/VOP3PInstructions.td:437-452
defm V_DOT4_I32_IU8 : VOP3PDOTIUInst<"v_dot4_i32_iu8", int_amdgcn_sudot4>;
defm V_DOT8_I32_IU4 : VOP3PDOTIUInst<"v_dot8_i32_iu4", int_amdgcn_sudot8>;
+
+def : GCNPat < (int_amdgcn_sdot8 i32:$src0,
+ i32:$src1,
+ i32:$src2, (i1 timm:$clamp)),
+ (V_DOT8_I32_IU4 (i32 8), i32:$src0,
----------------
jrbyrnes wrote:
> arsenm wrote:
> > I don't understand how these cases are different. Is the intrinsic name just slightly different from the instruction name?
> On all other targets with 8-bit and 4-bit signed dot products, we codegen for int_amdgcn_sdot4 and int_amdgcn_sdot8. However, we don't support these on gfx1100; instead, gfx1100 has int_amdgcn_sUdot4 / int_amdgcn_sUdot8. The result is that users of these intrinsics must always check the target and use the corresponding one (sudot4 for gfx1100, and sdot4 for all others).
>
> This removes that responsibility from the user, so they are able to use sdot4 across all targets and generate the corresponding instructions.
>
Are there unit tests for these somewhere? I don't really know the full history of these instructions, and I'm worried some edge-case behavior changed along the way.
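
To make the edge-case concern concrete, here is a rough reference model that a test could compare against. It is only a sketch under stated assumptions, not the documented hardware behavior: it assumes sdot4 / sdot8 treat each i32 operand as 4 x 8-bit or 8 x 4-bit signed lanes, add the products to a signed 32-bit accumulator, saturate when clamp is set, and wrap otherwise. The helper names (to_signed, sdot_reference) are hypothetical and do not exist in LLVM.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def sdot_reference(a, b, c, lanes, clamp=False):
    """Assumed semantics of a signed packed dot product with accumulate.

    a, b  : 32-bit patterns holding `lanes` packed signed elements
    c     : 32-bit accumulator pattern
    lanes : 4 for the sdot4 case, 8 for the sdot8 case
    clamp : if True, saturate to the i32 range (assumption);
            if False, wrap to 32 bits (assumption)
    """
    bits = 32 // lanes
    acc = to_signed(c, 32)
    for i in range(lanes):
        lane_a = to_signed(a >> (i * bits), bits)
        lane_b = to_signed(b >> (i * bits), bits)
        acc += lane_a * lane_b
    if clamp:
        acc = max(-(1 << 31), min((1 << 31) - 1, acc))
    return to_signed(acc, 32)
```

Inputs worth covering with a model like this are the most negative lane value (0x80 per byte, 0x8 per nibble) and an accumulator near INT32_MAX with and without clamp, since those are the cases most likely to differ if the two instruction families disagree.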
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D158468/new/
https://reviews.llvm.org/D158468