[llvm] [AMDGPU][True16][GlobalISel] Fix v2*16 build_vector patterns (PR #151496)

Brox Chen via llvm-commits llvm-commits at lists.llvm.org
Thu Jul 31 08:40:01 PDT 2025


================
@@ -3543,15 +3543,29 @@ def : GCNPat <
   (vecTy (UniformBinFrag<build_vector> (Ty undef), (Ty SReg_32:$src1))),
   (S_LSHL_B32 SReg_32:$src1, (i32 16))
 >;
-}
 
 def : GCNPat <
   (vecTy (DivergentBinFrag<build_vector> (Ty undef), (Ty VGPR_32:$src1))),
   (vecTy (V_LSHLREV_B32_e64 (i32 16), VGPR_32:$src1))
 >;
-} // End foreach Ty = ...
 }
 
+let True16Predicate = UseRealTrue16Insts in
+def : GCNPat <
+  (vecTy (DivergentBinFrag<build_vector> (Ty undef), (Ty VGPR_32:$src1))),
+  (REG_SEQUENCE VGPR_32, (Ty (IMPLICIT_DEF)), lo16, (Ty VGPR_32:$src1), hi16)
----------------
broxigarchen wrote:

Oh, I just saw that Ty is in the i16/f16/bf16 loop. In that case, I think this V_LSHLREV_B32_e64 pattern may only be needed for non-true16 mode.

Can you try removing this and see if ISel gets it right? Otherwise, maybe try replacing it with this:
(REG_SEQUENCE VGPR_32, (Ty (IMPLICIT_DEF)), lo16, (Ty VGPR_16:$src1), hi16)

>Do you have an example of this?

We should not take a 16-bit input into a vgpr32 in true16 mode. The 16-bit value could be the output of a true16 instruction, in which case ISel will insert a "vgpr32 = COPY vgpr16" to match the register class, which is illegal.
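
For illustration, a pattern that avoids the cross-class copy might look roughly like the sketch below. The result line is the replacement suggested above. Changing the source operand to VGPR_16 as well is an assumption made here, not something confirmed in this thread, and Ty and vecTy are assumed to be bound by the enclosing foreach, as in the surrounding hunk.

```
// Sketch only, not the committed pattern.
// Assumes Ty/vecTy come from the enclosing foreach over i16/f16/bf16.
let True16Predicate = UseRealTrue16Insts in
def : GCNPat <
  // Assumption: match the 16-bit source in a 16-bit register class, so ISel
  // does not need a "vgpr32 = COPY vgpr16" to satisfy the operand constraint.
  (vecTy (DivergentBinFrag<build_vector> (Ty undef), (Ty VGPR_16:$src1))),
  // Put the value directly into the high half; the low half stays undefined.
  (REG_SEQUENCE VGPR_32, (Ty (IMPLICIT_DEF)), lo16, (Ty VGPR_16:$src1), hi16)
>;
```

Whether ISel already selects this correctly without an explicit pattern is the open question raised above, so this is only worth keeping if removing the pattern does not work.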


https://github.com/llvm/llvm-project/pull/151496

