[PATCH] D81430: [AMDGPU] Custom lowering of i64 umulo/smulo

Mon Jun 8 15:33:36 PDT 2020

arsenm added a comment.

Can you also add test cases with power-of-2 constants, which the default expansion handles? I assume we miss out on these as-is?

  // mulo(X, 1 << S) -> { X << S, (X << S) >> S != X }
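
To spell out the identity in the comment above, here is a minimal standalone sketch of the unsigned case only. The helper name `umulo_pow2` and the `std::pair` return type are hypothetical and exist just for illustration; this is not code from the patch or from the default expansion. It assumes 0 <= S < 64.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical helper (illustration only): unsigned multiply-with-overflow
// of X by the constant (1 << S), assuming 0 <= S < 64.
// Returns { X << S, (X << S) >> S != X }.
std::pair<uint64_t, bool> umulo_pow2(uint64_t X, unsigned S) {
  uint64_t Shl = X << S;            // low 64 bits of X * (1 << S)
  bool Overflow = (Shl >> S) != X;  // shifting back recovers X only if no set bits were shifted out
  return {Shl, Overflow};
}
```

For example, `umulo_pow2(3, 2)` yields the product 12 with no overflow, while `umulo_pow2(1ULL << 63, 1)` yields 0 with the overflow flag set.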

================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:5008
+                            SL, VT, LHS, RHS);
+
+  SDValue Sign = isSigned
----------------
I assume this is extracted from the default expansion?

================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:5011
+    ? DAG.getNode(ISD::SRA, SL, VT, Result,
+                  DAG.getConstant(VT.getScalarSizeInBits() - 1, SL, MVT::i64))
+    : DAG.getConstant(0, SL, VT);
----------------
The shift amount should be i32, not MVT::i64.
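
For context, the `Sign` value in the hunk above (the low result shifted right arithmetically by bit width minus 1 for signed, 0 for unsigned) looks like the value that the high half of the full product gets compared against. The hunk does not show that comparison, so this is an assumption. A minimal standalone sketch of that check follows, done at 32 bits so the wide product fits in a standard 64-bit integer; the names `MulO32` and `mulo32` are hypothetical and not taken from the patch.

```cpp
#include <cstdint>

// Hypothetical result type (illustration only).
struct MulO32 {
  uint32_t Lo;    // truncated product
  bool Overflow;  // true if the full product does not fit in 32 bits
};

// Sketch of the high-half overflow check at 32 bits. Assumption: overflow is
// reported when the high half of the wide product differs from Sign.
MulO32 mulo32(uint32_t LHS, uint32_t RHS, bool IsSigned) {
  uint64_t Wide;
  if (IsSigned) {
    // Sign-extend both operands; the product always fits in int64_t.
    int64_t P = static_cast<int64_t>(static_cast<int32_t>(LHS)) *
                static_cast<int64_t>(static_cast<int32_t>(RHS));
    Wide = static_cast<uint64_t>(P);
  } else {
    Wide = static_cast<uint64_t>(LHS) * static_cast<uint64_t>(RHS);
  }
  uint32_t Lo = static_cast<uint32_t>(Wide);
  uint32_t Hi = static_cast<uint32_t>(Wide >> 32);
  // Signed: every bit is a copy of Lo's sign bit (what an SRA by 31 produces).
  // Unsigned: zero.
  uint32_t Sign = (IsSigned && (Lo >> 31)) ? 0xFFFFFFFFu : 0u;
  return {Lo, Hi != Sign};
}
```

With this sketch, `mulo32(0xFFFFFFFFu, 2, true)` (that is, -1 * 2) reports no overflow, while the same bit patterns with `IsSigned == false` do overflow.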

================
Comment at: llvm/test/CodeGen/AMDGPU/llvm.mulo.ll:4
+
+define { i64, i1 } @umulo_i64(i64 %x, i64 %y) {
+; GCN-LABEL: umulo_i64:
----------------
Can you also add a pair of tests that stresses the scalar path, and add a gfx9 run line?


https://reviews.llvm.org/D81430