[PATCH] D130432: [X86] Custom type legalize v2i32 smulo/umulo to use a single pmuldq/pmuludq.

Craig Topper via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Jul 23 16:21:28 PDT 2022


craig.topper created this revision.
craig.topper added reviewers: RKSimon, spatel.
Herald added subscribers: jsji, StephenFan, pengfei, hiraditya.
Herald added a project: All.
craig.topper requested review of this revision.
Herald added a project: LLVM.

With SSE4.1 and above, we were using three multiply instructions.
Type legalization widened the operation to v4i32, so the low half
was computed with pmulld while the high half took two pmuldq/pmuludq.

Instead, we can use a single pmuludq/pmuldq to calculate the full
64-bit products at once, extract the high and low bits, and compare
them to check for overflow.
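
To make the overflow check concrete, here is a scalar, per-lane sketch of
the idea in C++. It is only a model of the arithmetic, not the SelectionDAG
code from the patch; the names MulOverflow, umulo_lane and smulo_lane are
hypothetical. The unsigned case overflows when the high 32 bits are nonzero;
the signed case overflows when the high 32 bits are not the sign extension
of the low 32 bits.

```cpp
#include <cstdint>

// Hypothetical result type: low 32 bits of the product plus an overflow flag.
struct MulOverflow {
  uint32_t lo;
  bool overflow;
};

// Unsigned lane: models one 32x32->64 product as pmuludq would produce it.
constexpr MulOverflow umulo_lane(uint32_t a, uint32_t b) {
  uint64_t full = static_cast<uint64_t>(a) * b;
  uint32_t lo = static_cast<uint32_t>(full);
  uint32_t hi = static_cast<uint32_t>(full >> 32);
  // Overflow if any bit of the high half is set.
  return {lo, hi != 0};
}

// Signed lane: models one 32x32->64 product as pmuldq would produce it.
constexpr MulOverflow smulo_lane(int32_t a, int32_t b) {
  uint64_t full = static_cast<uint64_t>(static_cast<int64_t>(a) * b);
  uint32_t lo = static_cast<uint32_t>(full);
  uint32_t hi = static_cast<uint32_t>(full >> 32);
  // The high half must equal the sign of the low half replicated 32 times.
  uint32_t expected_hi = (lo & 0x80000000u) ? 0xFFFFFFFFu : 0u;
  return {lo, hi != expected_hi};
}
```

For example, smulo_lane(INT32_MIN, -1) yields a full product of 2^31, whose
low half has the sign bit set while the high half is zero, so the mismatch
flags overflow.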

I've restricted SMULO to SSE4.1 to get pmuldq. We can probably
do a fixup to use pmuludq on earlier targets, but that's for another day.
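
One possible shape for such a fixup, sketched in scalar C++ under my own
assumptions (the patch does not implement this, and the function names are
hypothetical): the signed high half can be recovered from the unsigned high
half by subtracting the other operand whenever an operand is negative, with
all arithmetic modulo 2^32.

```cpp
#include <cstdint>

// High 32 bits of the unsigned 32x32->64 product (what pmuludq provides).
constexpr uint32_t mulhu_lane(uint32_t a, uint32_t b) {
  return static_cast<uint32_t>((static_cast<uint64_t>(a) * b) >> 32);
}

// High 32 bits of the signed product, derived from the unsigned one.
// Identity: mulhs(a, b) = mulhu(a, b) - (a < 0 ? b : 0) - (b < 0 ? a : 0).
constexpr uint32_t mulhs_via_mulhu_lane(int32_t a, int32_t b) {
  uint32_t ua = static_cast<uint32_t>(a);
  uint32_t ub = static_cast<uint32_t>(b);
  uint32_t hi = mulhu_lane(ua, ub);
  if (a < 0) hi -= ub;  // wraps modulo 2^32 by design
  if (b < 0) hi -= ua;
  return hi;
}

// Reference: high 32 bits computed directly from the signed 64-bit product.
constexpr uint32_t mulhs_ref_lane(int32_t a, int32_t b) {
  return static_cast<uint32_t>(
      static_cast<uint64_t>(static_cast<int64_t>(a) * b) >> 32);
}
```

The low 32 bits are the same for signed and unsigned multiplication, so only
the high half would need this correction.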

I was going through my git stash and found an early version of this patch
from a year or two ago so I went ahead and finished it.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D130432

Files:
  llvm/lib/Target/X86/X86ISelLowering.cpp
  llvm/test/CodeGen/X86/vec_smulo.ll
  llvm/test/CodeGen/X86/vec_umulo.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D130432.447100.patch
Type: text/x-patch
Size: 14318 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20220723/d3758b67/attachment.bin>

