[PATCH] D141473: [PowerPC] Simplify fp-to-int store optimization

ChenZheng via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Apr 18 01:27:26 PDT 2023


shchenz added a comment.

> Yes, one less xscvdpsxws.

Good to hear that. This patch looks right to me. Some more comments



================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:8128
+  if ((DestTy == MVT::i8 || DestTy == MVT::i16) && Subtarget.hasP9Vector())
+    DestTy = MVT::i32;
   unsigned Opc = ISD::DELETED_NODE;
----------------
Should we use `MVT::i64` for `DestTy` if the target is 64-bit? With this, we should still have accurate round result if the input float is between `2^31` and  `2^63 -1` before truncating.


================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:8075
+  MVT DestTy = Op.getSimpleValueType();
+  if ((DestTy == MVT::i8 || DestTy == MVT::i16) && Subtarget.hasP9Vector())
+    DestTy = MVT::i32;
----------------
qiucf wrote:
> shchenz wrote:
> > This change is not enough I think. You extend this function to be used before type legalizer and make this function depend on the caller that will not pass other illegal types except `i8` and `i16`. How about simple types like `i2` and `i4`?
> > 
> > We need a guard for the type in this function.
> > 
> `combineStoreFPToInt` already rejected illegal types. And codegen for `i2` `i4` are good:
> 
> ```
> xscvdpsxws 0, 1
> mffprwz	3, 0
> clrlwi	3, 3, 28
> stb 3, 0(4)
> stb 3, 0(5)
> stb 3, 0(6)
> blr
> ```
OK. But if we assume caller will only pass certain types to this function, we may need to add an assertion here? Otherwise when there is a new caller for this function in future, it is not so obvious that this function has assumption for the types. For example, if we pass `i2` or `i4` or even pass `i8` or `i16` on a pwr8 target, we get `llvm_unreachable("Unhandled FP_TO_INT type in custom expander!");`?


================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:14878
-                    dl, ResVT == MVT::f128 ? MVT::f128 : MVT::f64, Val);
-  DCI.AddToWorklist(Val.getNode());
 
----------------
qiucf wrote:
> qiucf wrote:
> > shchenz wrote:
> > > We may need to following this logic and consider about adding the users back to the work list.
> > Is that necessary? DAG combiner will add the users into worklist.
> The root node to be returned is the `ST_VSR_SCAL_INT`, then `FCTIDZ`, then `FP_EXTEND` (if f32). The `FP_EXTEND` from f32 to f64 will disappear after ISel. The others are PPC-specific nodes so adding them into worklist manually will not help.
OK, fair. Thanks for checking.


================
Comment at: llvm/lib/Target/PowerPC/PPCInstrP10.td:1258
+  def : Pat<(PPCstore_scal_int_from_vsr f128:$src, (PPCmatpcreladdr PCRelForm:$dst), 8),
+            (PSTXSDpc (COPY_TO_REGCLASS $src, VSRC), $dst, 0)>;
 
----------------
Can you explain why we need to copy to regclass `VSRC`? The definition of `PSTXSDpc` seems requiring RC to be `VFRC` for `$src`.


================
Comment at: llvm/lib/Target/PowerPC/PPCInstrVSX.td:3161
+def : Pat<(PPCstore_scal_int_from_vsr f128:$src, ForceXForm:$dst, 8),
+          (STXSDX (COPY_TO_REGCLASS $src, VSRC), ForceXForm:$dst)>;
 
----------------
Compared with the lines at 4046, we seems missing some pattern here? This is for `[HasVSX, NoP9Vector]`.


================
Comment at: llvm/lib/Target/PowerPC/PPCInstrVSX.td:3311
+def : Pat<(PPCstore_scal_int_from_vsr f128:$src, ForceXForm:$dst, 4),
+          (STIWX (COPY_TO_REGCLASS $src, VSRC), ForceXForm:$dst)>;
 
----------------
This is for `[HasVSX, HasP8Vector]`


================
Comment at: llvm/lib/Target/PowerPC/PPCInstrVSX.td:4046
 // The 8-byte version is repeated here due to availability of D-Form STXSD.
-def : Pat<(PPCstore_scal_int_from_vsr
-            (f64 (PPCcv_fp_to_sint_in_vsr f128:$src)), XForm:$dst, 8),
-          (STXSDX (COPY_TO_REGCLASS (XSCVQPSDZ f128:$src), VFRC),
-                  XForm:$dst)>;
-def : Pat<(PPCstore_scal_int_from_vsr
-            (f64 (PPCcv_fp_to_sint_in_vsr f128:$src)), DSForm:$dst, 8),
-          (STXSD (COPY_TO_REGCLASS (XSCVQPSDZ f128:$src), VFRC),
-                 DSForm:$dst)>;
-def : Pat<(PPCstore_scal_int_from_vsr
-            (f64 (PPCcv_fp_to_sint_in_vsr f128:$src)), ForceXForm:$dst, 4),
-          (STXSIWX (COPY_TO_REGCLASS (XSCVQPSWZ $src), VFRC), ForceXForm:$dst)>;
-def : Pat<(PPCstore_scal_int_from_vsr
-            (f64 (PPCcv_fp_to_sint_in_vsr f128:$src)), ForceXForm:$dst, 2),
-          (STXSIHX (COPY_TO_REGCLASS (XSCVQPSWZ $src), VFRC), ForceXForm:$dst)>;
-def : Pat<(PPCstore_scal_int_from_vsr
-            (f64 (PPCcv_fp_to_sint_in_vsr f128:$src)), ForceXForm:$dst, 1),
-          (STXSIBX (COPY_TO_REGCLASS (XSCVQPSWZ $src), VFRC), ForceXForm:$dst)>;
-def : Pat<(PPCstore_scal_int_from_vsr
-            (f64 (PPCcv_fp_to_sint_in_vsr f64:$src)), XForm:$dst, 8),
-          (STXSDX (XSCVDPSXDS f64:$src), XForm:$dst)>;
-def : Pat<(PPCstore_scal_int_from_vsr
-            (f64 (PPCcv_fp_to_sint_in_vsr f64:$src)), DSForm:$dst, 8),
-          (STXSD (XSCVDPSXDS f64:$src), DSForm:$dst)>;
-def : Pat<(PPCstore_scal_int_from_vsr
-            (f64 (PPCcv_fp_to_sint_in_vsr f64:$src)), ForceXForm:$dst, 2),
-          (STXSIHX (XSCVDPSXWS f64:$src), ForceXForm:$dst)>;
-def : Pat<(PPCstore_scal_int_from_vsr
-            (f64 (PPCcv_fp_to_sint_in_vsr f64:$src)), ForceXForm:$dst, 1),
-          (STXSIBX (XSCVDPSXWS f64:$src), ForceXForm:$dst)>;
-
-// Instructions for store(fptoui).
-def : Pat<(PPCstore_scal_int_from_vsr
-            (f64 (PPCcv_fp_to_uint_in_vsr f128:$src)), XForm:$dst, 8),
-          (STXSDX (COPY_TO_REGCLASS (XSCVQPUDZ f128:$src), VFRC),
-                  XForm:$dst)>;
-def : Pat<(PPCstore_scal_int_from_vsr
-            (f64 (PPCcv_fp_to_uint_in_vsr f128:$src)), DSForm:$dst, 8),
-          (STXSD (COPY_TO_REGCLASS (XSCVQPUDZ f128:$src), VFRC),
-                 DSForm:$dst)>;
-def : Pat<(PPCstore_scal_int_from_vsr
-            (f64 (PPCcv_fp_to_uint_in_vsr f128:$src)), ForceXForm:$dst, 4),
-          (STXSIWX (COPY_TO_REGCLASS (XSCVQPUWZ $src), VFRC), ForceXForm:$dst)>;
-def : Pat<(PPCstore_scal_int_from_vsr
-            (f64 (PPCcv_fp_to_uint_in_vsr f128:$src)), ForceXForm:$dst, 2),
-          (STXSIHX (COPY_TO_REGCLASS (XSCVQPUWZ $src), VFRC), ForceXForm:$dst)>;
-def : Pat<(PPCstore_scal_int_from_vsr
-            (f64 (PPCcv_fp_to_uint_in_vsr f128:$src)), ForceXForm:$dst, 1),
-          (STXSIBX (COPY_TO_REGCLASS (XSCVQPUWZ $src), VFRC), ForceXForm:$dst)>;
-def : Pat<(PPCstore_scal_int_from_vsr
-            (f64 (PPCcv_fp_to_uint_in_vsr f64:$src)), XForm:$dst, 8),
-          (STXSDX (XSCVDPUXDS f64:$src), XForm:$dst)>;
-def : Pat<(PPCstore_scal_int_from_vsr
-            (f64 (PPCcv_fp_to_uint_in_vsr f64:$src)), DSForm:$dst, 8),
-          (STXSD (XSCVDPUXDS f64:$src), DSForm:$dst)>;
-def : Pat<(PPCstore_scal_int_from_vsr
-            (f64 (PPCcv_fp_to_uint_in_vsr f64:$src)), ForceXForm:$dst, 2),
-          (STXSIHX (XSCVDPUXWS f64:$src), ForceXForm:$dst)>;
-def : Pat<(PPCstore_scal_int_from_vsr
-            (f64 (PPCcv_fp_to_uint_in_vsr f64:$src)), ForceXForm:$dst, 1),
-          (STXSIBX (XSCVDPUXWS f64:$src), ForceXForm:$dst)>;
+def : Pat<(PPCstore_scal_int_from_vsr f64:$src, XForm:$dst, 8),
+          (STXSDX $src, XForm:$dst)>;
----------------
This is for `HasVSX, HasP9Vector`.

Except the patterns for `STXSIHX` and `STXSIBX`(Power9 specific), all other patterns seems common for these three predicates: `[HasVSX, HasP9Vector]`, `[HasVSX, HasP8Vector]`, `[HasVSX, NoP9Vector]`. Can we try to refactor these patterns a little bit, there seems some redundancy with current implementation?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141473/new/

https://reviews.llvm.org/D141473



More information about the llvm-commits mailing list