[llvm] [AMDGPU][True16][CodeGen] srl pattern for true16 mode (PR #132987)
Brox Chen via llvm-commits
llvm-commits at lists.llvm.org
Mon Mar 31 07:14:15 PDT 2025
================
@@ -2425,6 +2425,13 @@ def : GCNPat <(i1 imm:$imm),
let WaveSizePredicate = isWave32;
}
+let True16Predicate = UseRealTrue16Insts in
+foreach vt = [i32, v2i16] in
+def : GCNPat <
+ (vt (DivergentBinFrag<srl> VGPR_32:$src, (i32 16))),
+ (REG_SEQUENCE VGPR_32, (i16 (EXTRACT_SUBREG $src, hi16)), lo16, (V_MOV_B16_t16_e64 0, (i16 0x0000), 0), hi16)
----------------
broxigarchen wrote:
Hi Matt. I think we are talking about two issues here.
The first is the 16-bit vs. 32-bit uniform problem: we can teach the SrlCombine and promoteUniformOp to use 16-bit. Those shifts eventually get lowered to a smaller-size SRL, and finally to LSHRREV.
The second is what this patch is trying to do. For a 16-bit right shift on a VGPR, we do not lower to LSHRREV, but use REG_SEQUENCE with .l/.h access. This depends on the register type, since we don't have SGPR 16 support yet.
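
For readers following along, here is a rough Python model of the equivalence the pattern above relies on. It is only an illustration, not part of the patch: the function names are made up for this sketch, and a 32-bit VGPR is modeled as a plain integer whose upper and lower 16 bits stand in for the .h and .l halves.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def srl16_reference(src: int) -> int:
    """Plain 32-bit logical shift right by 16 (what LSHRREV would compute)."""
    return (src & MASK32) >> 16


def extract_hi16(src: int) -> int:
    """Model of EXTRACT_SUBREG $src, hi16: read the upper half (.h)."""
    return (src >> 16) & MASK16


def reg_sequence(lo16: int, hi16: int) -> int:
    """Model of REG_SEQUENCE: build a 32-bit value from two 16-bit halves."""
    return ((hi16 & MASK16) << 16) | (lo16 & MASK16)


def srl16_via_halves(src: int) -> int:
    """Old .h becomes the new .l; the new .h is a 16-bit zero (the V_MOV_B16 0)."""
    return reg_sequence(lo16=extract_hi16(src), hi16=0)


if __name__ == "__main__":
    for value in (0x00000000, 0x0000FFFF, 0xABCD1234, 0xFFFFFFFF):
        assert srl16_via_halves(value) == srl16_reference(value)
    print(hex(srl16_via_halves(0xABCD1234)))
```

In this model no shift is needed on the divergent path: the result is assembled purely from half-register reads and a zero move, which is why it depends on 16-bit register access being available.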
https://github.com/llvm/llvm-project/pull/132987