[llvm] [AArch64] Make use of byte FPR stores for bytes extracted from vectors (PR #134117)

Paul Walker via llvm-commits llvm-commits at lists.llvm.org
Thu Apr 3 09:11:00 PDT 2025


================
@@ -4589,6 +4589,18 @@ def : Pat<(truncstorei16 GPR64:$Rt, (am_unscaled16 GPR64sp:$Rn, simm9:$offset)),
 def : Pat<(truncstorei8 GPR64:$Rt, (am_unscaled8 GPR64sp:$Rn, simm9:$offset)),
   (STURBBi (EXTRACT_SUBREG GPR64:$Rt, sub_32), GPR64sp:$Rn, simm9:$offset)>;
 
+// v1i64 -> bsub truncating stores
+// Supporting pattern lower f32/64 -> v8i8
+def : Pat<(v8i8 (vector_insert (v8i8 (undef)), (i32 FPR32:$src), 0)),
+          (INSERT_SUBREG (v8i8 (IMPLICIT_DEF)), FPR32:$src, ssub)>;
+def : Pat<(v8i8 (vector_insert (v8i8 (undef)), (i64 FPR64:$src), 0)),
+          (v8i8 (EXTRACT_SUBREG (INSERT_SUBREG (v16i8 (IMPLICIT_DEF)), FPR64:$src, dsub), dsub))>;
+// Lower v1i64 -> v1i8 truncstore to bsub store
+def : Pat<(truncstorevi8 v1i64:$VT, (am_unscaled8 GPR64sp:$Rn, simm9:$offset)),
+          (STURBi (vi8 (EXTRACT_SUBREG v1i64:$VT, bsub)), GPR64sp:$Rn, simm9:$offset)>;
+def : Pat<(truncstorevi8 v1i64:$VT, (am_indexed8 GPR64sp:$Rn, uimm12s4:$offset)),
+          (STRBui (vi8 (EXTRACT_SUBREG v1i64:$VT, bsub)), GPR64sp:$Rn, uimm12s4:$offset)>;
----------------
paulwalker-arm wrote:

I hoped we could use something like `(i8 (COPY_TO_REGCLASS v1i64:$VT, FPR8))` but the downside to that is we then need to implement support to copy across the register classes within `copyPhysReg` which is generally frowned upon.  That alone might not be terminal but because the register classes do not match you do end up with redundant copies where the source and destination are the same.

https://github.com/llvm/llvm-project/pull/134117


More information about the llvm-commits mailing list