[PATCH] D93963: [GlobalISel][AMDGPU] Lower G_UMULO/G_SMULO

Pushpinder Singh via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 28 02:27:27 PST 2021


pdhaliwal added a comment.

Hi, apologies for late reply as I got sidetracked to some other work.



================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-umulo.mir:60
+    ; GFX8: [[ICMP1:%[0-9]+]]:_(s1) = G_ICMP intpred(ne), [[UMULH1]](s32), [[C]]
+    ; GFX8: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[ICMP]](s1)
+    ; GFX8: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[ICMP1]](s1)
----------------
foad wrote:
> Why do we get ANYEXT followed by AND with 1? In the scalar s32 case we just get ZEXT which is nicer.
This is coming from widening of BUILD_VECTOR resulting from legalization of ICMP instruction. So, I guess ZEXT is presumed dead once BUILD_VECTOR gets legalized.


================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-umulo.mir:115-118
+    ; GFX8: [[UADDO:%[0-9]+]]:_(s32), [[UADDO1:%[0-9]+]]:_(s1) = G_UADDO [[MUL]], [[MUL1]]
+    ; GFX8: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[UADDO1]](s1)
+    ; GFX8: [[UADDO2:%[0-9]+]]:_(s32), [[UADDO3:%[0-9]+]]:_(s1) = G_UADDO [[UADDO]], [[UMULH]]
+    ; GFX8: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[UADDO3]](s1)
----------------
foad wrote:
> Is there a good reason why this is using UADDO plus ZEXT plus a second ADD, instead of using UADDE?
This is because of legalization of UMULH for s64. I am thinking of having different patch for this as it impacts narrowing of UMULH.


================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-umulo.mir:218
+    ; GFX8: [[C2:%[0-9]+]]:_(s16) = G_CONSTANT i16 0
+    ; GFX8: [[MUL1:%[0-9]+]]:_(s16) = G_MUL [[TRUNC]], [[TRUNC1]]
+    ; GFX8: [[ICMP:%[0-9]+]]:_(s1) = G_ICMP intpred(ne), [[TRUNC2]](s16), [[C2]]
----------------
foad wrote:
> This expansion does an s32 multiply and an s16 multiply. It would be better to just do one s32 multiply -- you can extract all the information you need from the result of that.
I was able to get unsigned operation use single s32 multiply. Signed is getting bit tricky.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93963/new/

https://reviews.llvm.org/D93963



More information about the llvm-commits mailing list