[PATCH] D123835: AMDGPU/SDAG: Refine the fold to v_mad_[iu]64_[iu]32

Tue Apr 26 03:06:47 PDT 2022

foad added a comment.

> Only fold for uniform values on pre-GFX9 chips.

This made my brain hurt. I //think// it might be clearer as "For uniform values, only fold on pre-GFX9 chips." English is really imprecise.

================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:10702
+    for (auto I = LHS->use_begin(), E = LHS->use_end(); I != E; ++I) {
+      if (I.getUse().getResNo() != 0)
+        continue;
----------------
arsenm wrote:
> I don't understand why you're checking this if you bail on not ISD::ADD. I guess it would make sense if you were handling the carry out adds in a separate patch?
LHS is the MUL here, not an ADD, so there's really no need to check ResNo.

================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:10748
+  SDValue Shift = DAG.getShiftAmountConstant(32, MVT::i64, SL);
+  SDValue AccumLo = DAG.getNode(ISD::TRUNCATE, SL, MVT::i32, Accum);
+  SDValue AccumHi = DAG.getNode(ISD::SRL, SL, MVT::i64, Accum, Shift);
----------------
I don't know if it makes any practical difference, but other code like `AMDGPUTargetLowering::LowerUDIVREM64` uses EXTRACT_ELEMENT to split an i64 into a pair of i32s, and BITCAST(BUILD_VECTOR ...) to reassemble them.
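
For illustration only, the two ways of splitting an i64 can be modelled on plain integers. This is a minimal standalone sketch, not SelectionDAG code: `Halves`, `splitByShift`, `extractElement`, `splitByExtract` and `reassemble` are hypothetical names. It assumes that the SRL result is narrowed to 32 bits afterwards, and that element 0 of the rebuilt pair is the low half (as on a little-endian target such as AMDGPU).

```cpp
#include <cstdint>

// Hypothetical helper type: the low and high 32-bit halves of a 64-bit value.
struct Halves {
  std::uint32_t lo;
  std::uint32_t hi;
};

// Models the patch's idiom: TRUNCATE for the low half, SRL by 32 for the
// high half (assumed to be narrowed to 32 bits afterwards).
constexpr Halves splitByShift(std::uint64_t v) {
  return Halves{static_cast<std::uint32_t>(v),
                static_cast<std::uint32_t>(v >> 32)};
}

// Models EXTRACT_ELEMENT: index 0 selects the low half, index 1 the high half.
constexpr std::uint32_t extractElement(std::uint64_t v, unsigned idx) {
  return static_cast<std::uint32_t>(v >> (32 * idx));
}

constexpr Halves splitByExtract(std::uint64_t v) {
  return Halves{extractElement(v, 0), extractElement(v, 1)};
}

// Models BITCAST(BUILD_VECTOR lo, hi) back to i64, assuming element 0 is
// the low half.
constexpr std::uint64_t reassemble(Halves h) {
  return (static_cast<std::uint64_t>(h.hi) << 32) | h.lo;
}

// Both idioms produce the same halves, and reassembly round-trips.
static_assert(splitByShift(0x0123456789ABCDEFULL).lo == 0x89ABCDEFu, "low half");
static_assert(splitByShift(0x0123456789ABCDEFULL).hi == 0x01234567u, "high half");
static_assert(reassemble(splitByExtract(0x0123456789ABCDEFULL)) ==
                  0x0123456789ABCDEFULL,
              "round trip");
```

On plain values the two forms are equivalent, so any practical difference would come from how later DAG combines and instruction selection treat each node pattern.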

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D123835