[PATCH] D85653: [GlobalISel][AMDGPU] Lower G_SMULH/G_UMULH
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Sep 18 09:40:40 PDT 2020
arsenm added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:597
+ Mulh.clampMaxNumElements(0, S8, 2)
+ .fewerElementsIf(elementTypeIsNot(0, S8), scalarize(0));
+ } else {
----------------
pdhaliwal wrote:
> arsenm wrote:
> > This should be unnecessary
> If I drop this, the <2 x s32> case starts generating worse code. This is due to lowering coming into the picture which promotes the 32-bit mulh to 64-bit mul and then legalizing 64-bit mul. I can use VOP3P instruction only for S8. For others, I need to specify the scalarization.
This should be an unconditional scalarize. The scalarization shouldn't cause a 64-bit multiply to be used
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D85653/new/
https://reviews.llvm.org/D85653
More information about the llvm-commits
mailing list