[PATCH] D123835: AMDGPU/SDAG: Refine the fold to v_mad_[iu]64_[iu]32

Tue Apr 26 06:25:16 PDT 2022

arsenm added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:10748
+  SDValue Shift = DAG.getShiftAmountConstant(32, MVT::i64, SL);
+  SDValue AccumLo = DAG.getNode(ISD::TRUNCATE, SL, MVT::i32, Accum);
+  SDValue AccumHi = DAG.getNode(ISD::SRL, SL, MVT::i64, Accum, Shift);
----------------
foad wrote:
> I don't know if it makes any practical difference, but other code like `AMDGPUTargetLowering::LowerUDIVREM64` uses EXTRACT_ELEMENT to split an i64 into a pair of i32s, and BITCAST(BUILD_VECTOR ...) to reassemble them.
Using the shift adds extra steps. The combine on 64 bit shifts will turn this into the vector build 

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123835/new/

https://reviews.llvm.org/D123835