[PATCH] D85653: [GlobalISel][AMDGPU] Lower G_SMULH/G_UMULH

Fri Sep 18 09:40:40 PDT 2020

arsenm added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:597
+    Mulh.clampMaxNumElements(0, S8, 2)
+      .fewerElementsIf(elementTypeIsNot(0, S8), scalarize(0));
+  } else {
----------------
pdhaliwal wrote:
> arsenm wrote:
> > This should be unnecessary
> If I drop this, the <2 x s32> case starts generating worse code. This is due to lowering coming into the picture which promotes the 32-bit mulh to 64-bit mul and then legalizing 64-bit mul. I can use VOP3P instruction only for S8. For others, I need to specify the scalarization.
This should be an unconditional scalarize. The scalarization shouldn't cause a 64-bit multiply to be used

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D85653/new/

https://reviews.llvm.org/D85653