[PATCH] D20931: [X86] Reduce the width of multiplication when its operands are extended from i8 or i16

Thu Jun 2 14:36:52 PDT 2016

wmi created this revision.
wmi added reviewers: hfinkel, RKSimon, congh.
wmi added subscribers: llvm-commits, davidxl, mkuper.
wmi set the repository for this revision to rL LLVM.

For a mul of type <N x i32>, targets without SSE4.1 use pmuludq. Because pmuludq produces an <N/2 x i64> value, this often introduces many extra pack and unpack instructions in the vectorized loop body. However, when the operands of the <N x i32> mul are extended from smaller types such as i8 or i16, the mul can be shrunk to use pmullw + pmulhw/pmulhuw instead of pmuludq, which generates better code. Targets with SSE4.1 support pmulld, so no shrinking is needed there.
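To illustrate why the shrinking is valid, here is a minimal scalar sketch (not code from the patch) that models one 16-bit lane. It assumes the usual semantics of the instructions: pmullw yields the low 16 bits of each 16x16 product, while pmulhw and pmulhuw yield the high 16 bits of the signed and unsigned product respectively. All function names are hypothetical. In vector code the two halves would presumably be interleaved back into i32 lanes with something like punpcklwd/punpckhwd; here they are simply recombined with a shift and an or.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of one pmullw lane: low 16 bits of the product.
// The low half is identical for signed and unsigned inputs.
uint16_t mul_lo16(uint16_t a, uint16_t b) {
    uint32_t p = static_cast<uint32_t>(a) * static_cast<uint32_t>(b);
    return static_cast<uint16_t>(p & 0xFFFFu);
}

// Hypothetical scalar model of one pmulhw lane: high 16 bits, signed product.
uint16_t mul_hi16_signed(int16_t a, int16_t b) {
    int32_t p = static_cast<int32_t>(a) * static_cast<int32_t>(b);
    return static_cast<uint16_t>(static_cast<uint32_t>(p) >> 16);
}

// Hypothetical scalar model of one pmulhuw lane: high 16 bits, unsigned product.
uint16_t mul_hi16_unsigned(uint16_t a, uint16_t b) {
    uint32_t p = static_cast<uint32_t>(a) * static_cast<uint32_t>(b);
    return static_cast<uint16_t>(p >> 16);
}

// i32 mul whose operands were sign-extended from i16:
// recombine the pmullw and pmulhw halves.
int32_t shrunk_mul_sext(int16_t a, int16_t b) {
    uint32_t lo = mul_lo16(static_cast<uint16_t>(a), static_cast<uint16_t>(b));
    uint32_t hi = mul_hi16_signed(a, b);
    return static_cast<int32_t>((hi << 16) | lo);
}

// i32 mul whose operands were zero-extended from i16:
// recombine the pmullw and pmulhuw halves.
uint32_t shrunk_mul_zext(uint16_t a, uint16_t b) {
    uint32_t lo = mul_lo16(a, b);
    uint32_t hi = mul_hi16_unsigned(a, b);
    return (hi << 16) | lo;
}

// Compare the shrunk form against a plain 32-bit multiply on a few inputs.
void self_check() {
    const int16_t s[] = {-32768, -1, 0, 1, 255, 32767};
    for (int16_t a : s)
        for (int16_t b : s)
            assert(shrunk_mul_sext(a, b) ==
                   static_cast<int32_t>(a) * static_cast<int32_t>(b));
    const uint16_t u[] = {0, 1, 255, 32768, 65535};
    for (uint16_t a : u)
        for (uint16_t b : u)
            assert(shrunk_mul_zext(a, b) ==
                   static_cast<uint32_t>(a) * static_cast<uint32_t>(b));
}
```

When the operands are extended from i8, the product always fits in 16 bits (at most 255 * 255 unsigned, and within [-16256, 16384] signed), so the low half alone already carries the full result and only needs to be extended back to i32.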

Repository:
  rL LLVM

http://reviews.llvm.org/D20931

Files:
  lib/Target/X86/X86ISelLowering.cpp
  test/CodeGen/X86/shrink_vmul.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D20931.59458.patch
Type: text/x-patch
Size: 27761 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160602/4c6ef188/attachment-0001.bin>