[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)
Vikram Hegde via llvm-commits
llvm-commits at lists.llvm.org
Thu May 16 04:35:34 PDT 2024
================
@@ -243,11 +243,16 @@ def VOP_READFIRSTLANE : VOPProfile <[i32, i32, untyped, untyped]> {
// FIXME: Specify SchedRW for READFIRSTLANE_B32
// TODO: There is VOP3 encoding also
def V_READFIRSTLANE_B32 : VOP1_Pseudo <"v_readfirstlane_b32", VOP_READFIRSTLANE,
- getVOP1Pat<int_amdgcn_readfirstlane,
- VOP_READFIRSTLANE>.ret, 1> {
+ [], 1> {
let isConvergent = 1;
}
+foreach vt = Reg32Types.types in {
+ def : GCNPat<(vt (AMDGPUreadfirstlane (vt VRegOrLdsSrc_32:$src0))),
+ (V_READFIRSTLANE_B32 (vt VRegOrLdsSrc_32:$src0))
----------------
vikramRH wrote:
Attaching example match table snippets for v2i16 and p3 here; they should make the scenario a bit clearer.
For v2i16:
```
GIM_Try, /*On fail goto*//*Label 3499*/ GIMT_Encode4(202699), // Rule ID 2117 //
GIM_CheckIntrinsicID, /*MI*/0, /*Op*/1, GIMT_Encode2(Intrinsic::amdgcn_writelane),
GIM_RootCheckType, /*Op*/0, /*Type*/GILLT_v2s16,
GIM_RootCheckType, /*Op*/2, /*Type*/GILLT_v2s16,
GIM_RootCheckType, /*Op*/3, /*Type*/GILLT_s32,
GIM_RootCheckType, /*Op*/4, /*Type*/GILLT_v2s16,
GIM_RootCheckRegBankForClass, /*Op*/0, /*RC*/GIMT_Encode2(AMDGPU::VGPR_32RegClassID),
// (intrinsic_wo_chain:{ *:[v2i16] } 2863:{ *:[iPTR] }, v2i16:{ *:[v2i16] }:$src0, i32:{ *:[i32] }:$src1, v2i16:{ *:[v2i16] }:$src2) => (V_WRITELANE_B32:{ *:[v2i16] } SCSrc_b32:{ *:[v2i16] }:$src0, SCSrc_b32:{ *:[i32] }:$src1, VGPR_32:{ *:[v2i16] }:$src2)
GIR_BuildRootMI, /*Opcode*/GIMT_Encode2(AMDGPU::V_WRITELANE_B32),
```
And for p3:
```
GIM_Try, /*On fail goto*//*Label 3502*/ GIMT_Encode4(202816), // Rule ID 2129 //
GIM_CheckIntrinsicID, /*MI*/0, /*Op*/1, GIMT_Encode2(Intrinsic::amdgcn_writelane),
GIM_RootCheckType, /*Op*/0, /*Type*/GILLT_s32,
GIM_RootCheckType, /*Op*/2, /*Type*/GILLT_p2s32,
GIM_RootCheckType, /*Op*/3, /*Type*/GILLT_s32,
GIM_RootCheckType, /*Op*/4, /*Type*/GILLT_p2s32,
GIM_RootCheckRegBankForClass, /*Op*/0, /*RC*/GIMT_Encode2(AMDGPU::VGPR_32RegClassID),
// (intrinsic_wo_chain:{ *:[i32] } 2863:{ *:[iPTR] }, p2:{ *:[i32] }:$src0, i32:{ *:[i32] }:$src1, p2:{ *:[i32] }:$src2) => (V_WRITELANE_B32:{ *:[i32] } SCSrc_b32:{ *:[i32] }:$src0, SCSrc_b32:{ *:[i32] }:$src1, VGPR_32:{ *:[i32] }:$src2)
GIR_BuildRootMI, /*Opcode*/GIMT_Encode2(AMDGPU::V_WRITELANE_B32),
```
The destination type check for the p3 case is still for "GILLT_s32".
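To make the consequence concrete, here is a minimal sketch of how such a rule behaves. It is not the real GlobalISel matcher: `Ty`, `TypeCheck`, `ruleMatches` and `pointerRule` are hypothetical names that only mimic the `GIM_RootCheckType` sequence in the pointer snippet above. It also assumes the destination register of the intrinsic actually carries the pointer LLT.

```cpp
#include <vector>

// Hypothetical stand-ins for the LLT kinds that appear in the snippets above.
enum class Ty { S32, V2S16, P2S32 };

// One per-operand type check, loosely modeled on GIM_RootCheckType.
struct TypeCheck {
  unsigned OpIdx;
  Ty Expected;
};

// Returns true only if every check passes, mirroring how a GIM_Try block
// jumps to its failure label on the first predicate that does not hold.
bool ruleMatches(const std::vector<TypeCheck> &Checks,
                 const std::vector<Ty> &OperandTypes) {
  for (const TypeCheck &C : Checks) {
    if (C.OpIdx >= OperandTypes.size() || OperandTypes[C.OpIdx] != C.Expected)
      return false;
  }
  return true;
}

// The checks from the pointer snippet: Op 0 (the destination) is tested
// against s32, while Op 2 and Op 4 are tested against the pointer type.
std::vector<TypeCheck> pointerRule() {
  return {{0, Ty::S32}, {2, Ty::P2S32}, {3, Ty::S32}, {4, Ty::P2S32}};
}
```

Under these assumptions, an operand list whose destination is pointer-typed is rejected by `pointerRule()`, while the same list with an s32 destination is accepted. Operand 1 is the intrinsic ID and is never type-checked here, so any placeholder type works in that slot.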
https://github.com/llvm/llvm-project/pull/89217