[PATCH] D87236: [X86][SSE2] Use smarter instruction patterns for lowering UMIN/UMAX with v8i16 and SMIN/SMAX with v16i8.

Tom Hender via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Sep 7 07:24:19 PDT 2020


TomHender created this revision.
TomHender added reviewers: RKSimon, craig.topper.
TomHender added a project: LLVM.
Herald added subscribers: llvm-commits, hiraditya.
TomHender requested review of this revision.

This is my first LLVM patch, so please tell me if there are any process issues.

The main observation behind this patch is that UMIN/UMAX with v8i16 can be lowered using unsigned saturating subtraction. Previously this operation was lowered by flipping the sign bit of both inputs and of the output, which turns the unsigned minimum/maximum into a signed one.
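To illustrate the idea, here is a scalar sketch of one 16-bit lane. It assumes the identities umin(a, b) = a - subus(a, b) and umax(a, b) = subus(a, b) + b, where subus models a single lane of PSUBUSW; the helper names are made up for this example and the exact instruction sequence the patch emits may differ.

```cpp
#include <cstdint>

// Hypothetical scalar model of one 16-bit lane of PSUBUSW:
// unsigned subtraction that clamps at zero instead of wrapping.
constexpr std::uint16_t subus16(std::uint16_t a, std::uint16_t b) {
    return a > b ? static_cast<std::uint16_t>(a - b) : std::uint16_t{0};
}

// If a > b, subus(a, b) == a - b, so a - (a - b) == b.
// Otherwise subus(a, b) == 0 and the result is a.
constexpr std::uint16_t umin16(std::uint16_t a, std::uint16_t b) {
    return static_cast<std::uint16_t>(a - subus16(a, b));
}

// If a > b, (a - b) + b == a. Otherwise 0 + b == b.
constexpr std::uint16_t umax16(std::uint16_t a, std::uint16_t b) {
    return static_cast<std::uint16_t>(subus16(a, b) + b);
}

// Compile-time spot checks around the signed/unsigned boundary.
static_assert(umin16(0x8000, 0x7FFF) == 0x7FFF, "umin across the sign boundary");
static_assert(umax16(0x8000, 0x7FFF) == 0x8000, "umax across the sign boundary");
```

Under these assumptions each min/max needs one saturating subtraction plus one ordinary add or subtract, and no constant for the sign mask.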

The sign-bit trick can also be used in reverse to lower SMIN/SMAX with v16i8: flipping the sign bits turns the signed minimum/maximum into an unsigned one. I think that in terms of latency/throughput this is the same as the generic lowering, but it needs one fewer move instruction and the sign-bit flips have a better chance of being optimized further. This is particularly apparent in the "reduce" test cases.
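As a rough scalar sketch of the v16i8 case: SSE2 provides unsigned byte min/max (PMINUB/PMAXUB) but no signed byte variants, so XORing each lane with 0x80 maps the signed order onto the unsigned order and back. The function names below are invented for the illustration, and the casts assume two's-complement conversion (guaranteed since C++20, and the behavior of mainstream compilers before that).

```cpp
#include <cstdint>

// Flip the sign bit of one byte lane (models PXOR with a splat of 0x80).
constexpr std::uint8_t flip_sign(std::uint8_t x) {
    return static_cast<std::uint8_t>(x ^ 0x80u);
}

// Hypothetical scalar models of one lane of PMINUB / PMAXUB.
constexpr std::uint8_t umin8(std::uint8_t a, std::uint8_t b) { return a < b ? a : b; }
constexpr std::uint8_t umax8(std::uint8_t a, std::uint8_t b) { return a > b ? a : b; }

// Signed min via unsigned min: flip both inputs, take the unsigned min, flip back.
constexpr std::int8_t smin8(std::int8_t a, std::int8_t b) {
    return static_cast<std::int8_t>(flip_sign(umin8(
        flip_sign(static_cast<std::uint8_t>(a)),
        flip_sign(static_cast<std::uint8_t>(b)))));
}

// Signed max via unsigned max, using the same flips.
constexpr std::int8_t smax8(std::int8_t a, std::int8_t b) {
    return static_cast<std::int8_t>(flip_sign(umax8(
        flip_sign(static_cast<std::uint8_t>(a)),
        flip_sign(static_cast<std::uint8_t>(b)))));
}
```

In a chain of such operations, the flip on one result and the flip on the next input cancel each other out, which would be consistent with the improvement seen in the "reduce" tests.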

Unfortunately, this argument also applies in reverse to the new lowering of UMIN/UMAX with v8i16, so this patch slightly regresses the "horizontal-reduce-umax", "horizontal-reduce-umin", "vector-reduce-umin" and "vector-reduce-umax" test cases. Some extra casework might make it possible to avoid this. Independent of that, however, I believe that the benefits in the common case of just 1 to 3 chained min/max instructions outweigh the downsides in that specific case.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D87236

Files:
  llvm/lib/Target/X86/X86ISelLowering.cpp
  llvm/test/CodeGen/X86/horizontal-reduce-smax.ll
  llvm/test/CodeGen/X86/horizontal-reduce-smin.ll
  llvm/test/CodeGen/X86/horizontal-reduce-umax.ll
  llvm/test/CodeGen/X86/horizontal-reduce-umin.ll
  llvm/test/CodeGen/X86/machine-combiner-int-vec.ll
  llvm/test/CodeGen/X86/masked_store_trunc_usat.ll
  llvm/test/CodeGen/X86/midpoint-int-vec-128.ll
  llvm/test/CodeGen/X86/sat-add.ll
  llvm/test/CodeGen/X86/smax.ll
  llvm/test/CodeGen/X86/smin.ll
  llvm/test/CodeGen/X86/umax.ll
  llvm/test/CodeGen/X86/umin.ll
  llvm/test/CodeGen/X86/vec_minmax_sint.ll
  llvm/test/CodeGen/X86/vec_minmax_uint.ll
  llvm/test/CodeGen/X86/vector-reduce-smax.ll
  llvm/test/CodeGen/X86/vector-reduce-smin.ll
  llvm/test/CodeGen/X86/vector-reduce-umax.ll
  llvm/test/CodeGen/X86/vector-reduce-umin.ll
  llvm/test/CodeGen/X86/vector-trunc-usat.ll
  llvm/test/CodeGen/X86/vselect-minmax.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D87236.290271.patch
Type: text/x-patch
Size: 217780 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200907/af97c8cb/attachment-0001.bin>
