[PATCH] D93963: [GlobalISel][AMDGPU] Lower G_UMULO/G_SMULO
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jan 15 02:15:24 PST 2021
foad added inline comments.
================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-umulo.mir:60
+ ; GFX8: [[ICMP1:%[0-9]+]]:_(s1) = G_ICMP intpred(ne), [[UMULH1]](s32), [[C]]
+ ; GFX8: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[ICMP]](s1)
+ ; GFX8: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[ICMP1]](s1)
----------------
Why do we get ANYEXT followed by AND with 1? In the scalar s32 case we just get ZEXT which is nicer.
================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-umulo.mir:115-118
+ ; GFX8: [[UADDO:%[0-9]+]]:_(s32), [[UADDO1:%[0-9]+]]:_(s1) = G_UADDO [[MUL]], [[MUL1]]
+ ; GFX8: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[UADDO1]](s1)
+ ; GFX8: [[UADDO2:%[0-9]+]]:_(s32), [[UADDO3:%[0-9]+]]:_(s1) = G_UADDO [[UADDO]], [[UMULH]]
+ ; GFX8: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[UADDO3]](s1)
----------------
Is there a good reason why this is using UADDO plus ZEXT plus a second ADD, instead of using UADDE?
================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-umulo.mir:218
+ ; GFX8: [[C2:%[0-9]+]]:_(s16) = G_CONSTANT i16 0
+ ; GFX8: [[MUL1:%[0-9]+]]:_(s16) = G_MUL [[TRUNC]], [[TRUNC1]]
+ ; GFX8: [[ICMP:%[0-9]+]]:_(s1) = G_ICMP intpred(ne), [[TRUNC2]](s16), [[C2]]
----------------
This expansion does an s32 multiply and an s16 multiply. It would be better to just do one s32 multiply -- you can extract all the information you need from the result of that.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D93963/new/
https://reviews.llvm.org/D93963
More information about the llvm-commits
mailing list