[all-commits] [llvm/llvm-project] 00060a: [X86] Custom type legalize v2i32 smulo/umulo to us...

Craig Topper via All-commits all-commits at lists.llvm.org
Mon Jul 25 09:12:54 PDT 2022


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 00060a7b9797f87a5baffef224637e2469189db8
      https://github.com/llvm/llvm-project/commit/00060a7b9797f87a5baffef224637e2469189db8
  Author: Craig Topper <craig.topper at sifive.com>
  Date:   2022-07-25 (Mon, 25 Jul 2022)

  Changed paths:
    M llvm/lib/Target/X86/X86ISelLowering.cpp
    M llvm/test/CodeGen/X86/vec_smulo.ll
    M llvm/test/CodeGen/X86/vec_umulo.ll

  Log Message:
  -----------
  [X86] Custom type legalize v2i32 smulo/umulo to use a single pmuldq/pmuludq.

With SSE4.1 and above we were using 3 multiply instructions. This
was due to type legalization widening to v4i32, with the low half of
each product computed by pmulld and the high half by two pmuldq/pmuludq.

Instead of that, we can use a single pmuludq/pmuldq to calculate
the full product at once, extract the high and low bits and compare
to check for overflow.
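
As a rough scalar illustration of the check described above (not the
actual X86ISelLowering.cpp code), one lane can be modelled as follows.
The names MulResult, umulo32 and smulo32 are hypothetical. The signed
comparison against the sign bits of the low half is an assumption about
how "compare to check for overflow" is done; the unsigned case compares
the high half against zero.

```cpp
#include <cstdint>

// Hypothetical helper type: low 32 bits of the product plus overflow flag.
struct MulResult {
  uint32_t lo;    // truncated 32-bit result (raw bits)
  bool overflow;  // true if the full product does not fit in 32 bits
};

// Models one pmuludq lane: 32 x 32 -> 64-bit unsigned product.
constexpr MulResult umulo32(uint32_t a, uint32_t b) {
  uint64_t full = static_cast<uint64_t>(a) * b;
  uint32_t lo = static_cast<uint32_t>(full);
  uint32_t hi = static_cast<uint32_t>(full >> 32);
  // Unsigned overflow: any set bit in the high half.
  return {lo, hi != 0};
}

// Models one pmuldq lane: 32 x 32 -> 64-bit signed product.
constexpr MulResult smulo32(int32_t a, int32_t b) {
  int64_t full = static_cast<int64_t>(a) * b;
  uint64_t bits = static_cast<uint64_t>(full);  // well-defined modular cast
  uint32_t lo = static_cast<uint32_t>(bits);
  uint32_t hi = static_cast<uint32_t>(bits >> 32);
  // Signed overflow: the high half is not the sign extension of the low half.
  uint32_t sign_fill = (lo & 0x80000000u) ? 0xFFFFFFFFu : 0u;
  return {lo, hi != sign_fill};
}
```

For example, smulo32(INT32_MIN, -1) reports overflow because the full
product 2^31 has a zero high half while the low half has its sign bit set.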

I've restricted SMULO to SSE4.1, since pmuldq requires it. We can
probably use pmuludq with a fixup on earlier targets, but that's for
another day.

I was going through my git stash and found an early version of this patch
from a year or two ago so I went ahead and finished it.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D130432



