[llvm] SystemZ: Stop casting fp typed atomic loads in the IR (PR #90768)

Thu May 2 04:39:48 PDT 2024

================
@@ -6249,6 +6249,41 @@ SDValue SystemZTargetLowering::LowerOperation(SDValue Op,
   }
 }
 
+// Manually lower a bitcast to avoid introducing illegal types after type
+// legalization.
+static SDValue expandBitCastI128ToF128(SelectionDAG &DAG, SDValue Src,
+                                       SDValue Chain, const SDLoc &SL) {
+  MachineFunction &MF = DAG.getMachineFunction();
+  const DataLayout &DL = DAG.getDataLayout();
+
+  assert(DL.isBigEndian());
+
+  Align F128Align = DL.getPrefTypeAlign(Type::getFP128Ty(*DAG.getContext()));
+  SDValue StackTemp = DAG.CreateStackTemporary(MVT::f128, F128Align.value());
+  int FI = cast<FrameIndexSDNode>(StackTemp)->getIndex();
+  Align A = MF.getFrameInfo().getObjectAlign(FI);
+
+  SDValue Hi =
+      DAG.getTargetExtractSubreg(SystemZ::subreg_h64, SL, MVT::i64, Src);
+  SDValue Lo =
+      DAG.getTargetExtractSubreg(SystemZ::subreg_l64, SL, MVT::i64, Src);
+
----------------
uweigand wrote:

I don't see why we need to go via memory here.  This routine is only used when i128 is not legal, which means we have no vector registers and f128 lives in FP128.  So we have to move from GR128 (a pair of GR64) to FP128 (a pair of FP64), which means we can just use two LDGR to move the two 64-bit parts separately.

I think after splitting the GR128 into Hi and Lo as you did here, you should be able to just bitcast both parts to f64 (which will emit the LDGRs), and then join the two f64 parts via `REG_SEQUENCE` into an f128.
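
To make the reasoning concrete, here is a minimal standalone C++ model of the two approaches. It is not SelectionDAG code: `GR128`, `FP128Pair`, `ldgr`, `viaRegisters`, and `viaStack` are hypothetical names made up for this illustration, and the sketch assumes only what is stated above (the i128 is a pair of 64-bit halves, and the f128 is a pair of 64-bit FP halves with the high half first). It shows that moving each half separately and pairing the results gives the same bits as a round trip through a 16-byte stack slot.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical model of a GR128: a pair of 64-bit general registers.
struct GR128 {
  std::uint64_t Hi; // corresponds to subreg_h64
  std::uint64_t Lo; // corresponds to subreg_l64
};

// Hypothetical model of an FP128 register pair: two 64-bit FP registers.
struct FP128Pair {
  double Hi;
  double Lo;
};

// Models a single LDGR: copy 64 bits from a GPR into an FPR unchanged.
static double ldgr(std::uint64_t Bits) {
  double D;
  std::memcpy(&D, &Bits, sizeof(D));
  return D;
}

// Register-only path: bitcast each half, then pair the two results
// (the pairing stands in for REG_SEQUENCE).
static FP128Pair viaRegisters(GR128 Src) {
  return FP128Pair{ldgr(Src.Hi), ldgr(Src.Lo)};
}

// Memory path: store both halves to a 16-byte slot with the high half at
// offset 0 (big-endian half order), then reload the slot as two 64-bit halves.
static FP128Pair viaStack(GR128 Src) {
  unsigned char Slot[16];
  std::memcpy(Slot, &Src.Hi, 8);
  std::memcpy(Slot + 8, &Src.Lo, 8);
  FP128Pair Out;
  std::memcpy(&Out.Hi, Slot, 8);
  std::memcpy(&Out.Lo, Slot + 8, 8);
  return Out;
}

// Compare two pairs bit for bit (avoids floating-point comparison rules).
static bool sameBits(FP128Pair A, FP128Pair B) {
  return std::memcmp(&A.Hi, &B.Hi, 8) == 0 &&
         std::memcmp(&A.Lo, &B.Lo, 8) == 0;
}
```

For any input, `sameBits(viaRegisters(G), viaStack(G))` should hold in this model, which is why the stack temporary looks unnecessary: the memory round trip does nothing that two independent 64-bit moves do not already do.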

https://github.com/llvm/llvm-project/pull/90768