[PATCH] D108522: [X86][SSE] combineMulToPMADDWD - improve recognition of sign/zero extended upper bits

Simon Pilgrim via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Aug 22 13:15:00 PDT 2021


RKSimon created this revision.
RKSimon added reviewers: pengfei, spatel, lebedev.ri, craig.topper.
Herald added subscribers: hiraditya, inglorion.
RKSimon requested review of this revision.
Herald added a project: LLVM.

PMADDWD(v8i16 x, v8i16 y) == (v4i32) { (int)x[0]*(int)y[0] + (int)x[1]*(int)y[1], ..., (int)x[6]*(int)y[6] + (int)x[7]*(int)y[7] }

Currently combineMulToPMADDWD only folds cases where the upper 17 bits of both vXi32 inputs are known zero (i.e. in each 2xi16 pair, the first half is non-negative and the second half is zero). This can be relaxed to require only one zero-extended input, provided the other input has at least 17 sign bits.

That way the sign of the result is still preserved, and the second product in each pair is still zero because the zero-extended input's second half is zero.
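
The arithmetic behind this can be checked with a small scalar model. The following is only a sketch in Python, not code from the patch: the helper names (to_i16, to_i32, pmaddwd_lane, mul_lane) are hypothetical and model a single i32 lane viewed as a pair of i16 halves.

```python
def to_i16(v):
    """Interpret the low 16 bits of v as a signed 16-bit integer."""
    v &= 0xFFFF
    return v - 0x10000 if v & 0x8000 else v


def to_i32(v):
    """Wrap v to a signed 32-bit integer."""
    v &= 0xFFFFFFFF
    return v - 0x100000000 if v & 0x80000000 else v


def pmaddwd_lane(a, b):
    """Model one i32 output lane of PMADDWD.

    Each i32 input lane is viewed as two i16 halves; the halves are
    multiplied pairwise as signed values and the two products are added.
    """
    a_lo, a_hi = to_i16(a), to_i16(a >> 16)
    b_lo, b_hi = to_i16(b), to_i16(b >> 16)
    return to_i32(a_lo * b_lo + a_hi * b_hi)


def mul_lane(a, b):
    """Model a plain i32 multiply of one lane."""
    return to_i32(a * b)


# a has at least 17 sign bits (fits in a signed i16); its upper half is
# all ones when a is negative. b has its upper 17 bits zero, so its
# upper half is zero and the second product vanishes.
a, b = -5, 7
relaxed_case = pmaddwd_lane(a, b)

# If both inputs are merely sign-extended and both are negative, the
# upper halves are both -1 and contribute an extra +1.
both_negative = pmaddwd_lane(-5, -7)
```

In this model, relaxed_case matches mul_lane(-5, 7), while both_negative differs from mul_lane(-5, -7) by one, which is why at least one input still needs known-zero upper bits.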

Noticed while investigating PR47437.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D108522

Files:
  llvm/lib/Target/X86/X86ISelLowering.cpp
  llvm/test/CodeGen/X86/madd.ll
  llvm/test/CodeGen/X86/pmaddubsw.ll
  llvm/test/CodeGen/X86/shrink_vmul.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D108522.367997.patch
Type: text/x-patch
Size: 16415 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20210822/3844c5f7/attachment.bin>