[llvm] [AMDGPU][True16][MC][CodeGen] true16 for v_cndmask_b16 (PR #119736)
Joe Nash via llvm-commits
llvm-commits at lists.llvm.org
Tue Jan 14 11:13:37 PST 2025
================
@@ -1245,11 +1245,29 @@ class VOPSelectPat <ValueType vt> : GCNPat <
(vt (select i1:$src0, vt:$src1, vt:$src2)),
(V_CNDMASK_B32_e64 0, VSrc_b32:$src2, 0, VSrc_b32:$src1, SSrc_i1:$src0)
>;
+class VOPSelectPat_t16 <ValueType vt> : GCNPat <
+ (vt (select i1:$src0, vt:$src1, vt:$src2)),
+ (V_CNDMASK_B16_t16_e64 0, VSrcT_b16:$src2, 0, VSrcT_b16:$src1, SSrc_i1:$src0)
+>;
+class VOPSelectPat_fake16 <ValueType vt> : GCNPat <
+ (vt (select i1:$src0, vt:$src1, vt:$src2)),
+ (V_CNDMASK_B16_fake16_e64 0, VSrc_b16:$src2, 0, VSrc_b16:$src1, SSrc_i1:$src0)
+>;
def : VOPSelectModsPat <i32>;
def : VOPSelectModsPat <f32>;
-def : VOPSelectPat <f16>;
-def : VOPSelectPat <i16>;
+let True16Predicate = NotHasTrue16BitInsts in {
+ def : VOPSelectPat <f16>;
+ def : VOPSelectPat <i16>;
+} // End True16Predicate = NotHasTrue16BitInsts
+let True16Predicate = UseRealTrue16Insts in {
+ def : VOPSelectPat_t16 <f16>;
+ def : VOPSelectPat_t16 <i16>;
+} // End True16Predicate = UseRealTrue16Insts
+let True16Predicate = UseFakeTrue16Insts in {
+ def : VOPSelectPat_fake16 <f16>;
+ def : VOPSelectPat_fake16 <i16>;
----------------
Sisyph wrote:
I think we might want to continue using cndmask_b32 for Fake16. The reason is that we have a VOPD cndmask_b32 but no VOPD cndmask_b16, so selecting the b16 form in fake16 mode carries a performance penalty, with none of the register-saving upside that we get in real true16 mode.
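For illustration only, the suggestion could look roughly like the fragment below. It reuses the existing `VOPSelectPat` class (which selects `V_CNDMASK_B32_e64`) under the `UseFakeTrue16Insts` predicate instead of instantiating `VOPSelectPat_fake16`. This is a sketch built from names already present in the hunk above, not the change that was necessarily committed, and it assumes `VOPSelectPat_fake16` would then have no remaining users.

```tablegen
// Hypothetical sketch: keep the 32-bit cndmask for fake16 so VOPD pairing
// stays available; only real true16 switches to V_CNDMASK_B16_t16_e64.
let True16Predicate = UseFakeTrue16Insts in {
  def : VOPSelectPat <f16>;
  def : VOPSelectPat <i16>;
} // End True16Predicate = UseFakeTrue16Insts
```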
https://github.com/llvm/llvm-project/pull/119736