[all-commits] [llvm/llvm-project] 00060a: [X86] Custom type legalize v2i32 smulo/umulo to us...
Craig Topper via All-commits
all-commits at lists.llvm.org
Mon Jul 25 09:12:54 PDT 2022
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 00060a7b9797f87a5baffef224637e2469189db8
https://github.com/llvm/llvm-project/commit/00060a7b9797f87a5baffef224637e2469189db8
Author: Craig Topper <craig.topper at sifive.com>
Date: 2022-07-25 (Mon, 25 Jul 2022)
Changed paths:
M llvm/lib/Target/X86/X86ISelLowering.cpp
M llvm/test/CodeGen/X86/vec_smulo.ll
M llvm/test/CodeGen/X86/vec_umulo.ll
Log Message:
-----------
[X86] Custom type legalize v2i32 smulo/umulo to use a single pmuldq/pmuludq.
With SSE4.1 and above we were using 3 multiply instructions. This
was due to type legalization widening to v4i32 and the low half
being done with pmulld while the high half used two pmuldq/pmuludq.
Instead of that, we can use a single pmuludq/pmuldq to calculate
the full product at once, extract the high and low bits and compare
to check for overflow.
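
For illustration only, here is a scalar C++ model of the per-lane arithmetic described above: form the full 64-bit product, split it into high and low 32-bit halves, and compare to detect overflow. This is a sketch, not the code in X86ISelLowering.cpp. The names MulO32, umulo32 and smulo32 are hypothetical, and the widening multiplies merely stand in for what pmuludq/pmuldq compute on each lane.

```cpp
#include <cstdint>

// Hypothetical helper type: the result of a multiply-with-overflow on one
// 32-bit lane.
struct MulO32 {
  uint32_t lo;    // low 32 bits of the product (the wrapped result)
  bool overflow;  // true if the full product does not fit in 32 bits
};

// Unsigned lane. The 32x32 -> 64 unsigned multiply models one pmuludq lane.
inline MulO32 umulo32(uint32_t a, uint32_t b) {
  uint64_t full = static_cast<uint64_t>(a) * b;
  uint32_t lo = static_cast<uint32_t>(full);
  uint32_t hi = static_cast<uint32_t>(full >> 32);
  // Unsigned overflow: any bit set in the high half.
  return {lo, hi != 0};
}

// Signed lane. The 32x32 -> 64 signed multiply models one pmuldq lane.
inline MulO32 smulo32(int32_t a, int32_t b) {
  int64_t full = static_cast<int64_t>(a) * b;
  uint64_t bits = static_cast<uint64_t>(full);
  uint32_t lo = static_cast<uint32_t>(bits);
  uint32_t hi = static_cast<uint32_t>(bits >> 32);
  // Signed overflow: the high half must equal the sign extension of the
  // low half (all ones if lo is negative as an i32, all zeros otherwise).
  uint32_t sign = (lo & 0x80000000u) ? 0xFFFFFFFFu : 0u;
  return {lo, hi != sign};
}
```

In this model, smulo32(INT32_MIN, -1) reports overflow because the high half is zero while the low half has its sign bit set.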
I've restricted SMULO to SSE4.1 to get pmuldq. We can probably
do a fixup to pmuludq on earlier targets, but that's for another day.
I was going through my git stash and found an early version of this patch
from a year or two ago so I went ahead and finished it.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D130432