[PATCH] D108522: [X86][SSE] combineMulToPMADDWD - improve recognition of sign/zero extended upper bits

Simon Pilgrim via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 1 08:35:56 PDT 2021


RKSimon added a comment.

In D108522#2976709 <https://reviews.llvm.org/D108522#2976709>, @pengfei wrote:

> The math looks good to me.
> Wild thought: can we extend to zero/signed bits = 16 if the other element has more than 17 bits zero/signed? I think this should be common as a sext/zext <2 x i16> to <2 x i32>.

Thanks - we might be able to do something along those lines. I'll do some more testing to see what would be safe. The implicit sign extension of PMADDWD is quite powerful if we use it properly.
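
As a rough illustration of why the upper bits matter, here is a scalar model of a single 32-bit PMADDWD lane compared against a plain i32 multiply. It is only a sketch: pmaddwdLane, mulLane and lanesAgree are hypothetical helper names for this example, not code from the patch, and the model covers one lane rather than a full vector.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of one i32 result lane of PMADDWD: each i32 input is viewed
// as two i16 halves, the halves are sign-extended, multiplied pairwise, and
// the two products are added. (Illustrative helper, not from the patch.)
static int32_t pmaddwdLane(uint32_t A, uint32_t B) {
  // Narrowing to int16_t reinterprets the low 16 bits as signed
  // (modular conversion on mainstream compilers, guaranteed since C++20).
  int32_t ALo = static_cast<int16_t>(A & 0xFFFF);
  int32_t AHi = static_cast<int16_t>(A >> 16);
  int32_t BLo = static_cast<int16_t>(B & 0xFFFF);
  int32_t BHi = static_cast<int16_t>(B >> 16);
  // Use a 64-bit intermediate so the model itself never overflows.
  int64_t Sum = static_cast<int64_t>(ALo) * BLo +
                static_cast<int64_t>(AHi) * BHi;
  return static_cast<int32_t>(static_cast<uint32_t>(Sum));
}

// Wrapping 32-bit multiply, i.e. what a plain i32 mul computes.
static int32_t mulLane(uint32_t A, uint32_t B) {
  return static_cast<int32_t>(A * B);
}

// True if replacing mul with PMADDWD would be safe for these two inputs.
static bool lanesAgree(int32_t A, int32_t B) {
  uint32_t UA = static_cast<uint32_t>(A), UB = static_cast<uint32_t>(B);
  return pmaddwdLane(UA, UB) == mulLane(UA, UB);
}

// A few hand-checked cases.
static void selfCheck() {
  assert(lanesAgree(0x7FFF, 0x7FFF)); // both have 17 zero upper bits
  assert(lanesAgree(-3, 5));          // 17 sign bits vs 17 zero upper bits
  assert(!lanesAgree(-3, -5));        // both upper halves are -1: extra +1
  assert(!lanesAgree(0x8000, 1));     // only 16 zero bits: low half reads as negative
}
```

In this model, two inputs with 17 zero upper bits always agree with the i32 multiply, and so does a sign-extended i16 paired with such an input, because the upper-half product is zero. Two negative sign-extended inputs do not agree, since the upper halves (both -1) contribute an extra +1.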



================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:51613
 
+// Simplify VPMADDWD operations.
+static SDValue combineVPMADDWD(SDNode *N, SelectionDAG &DAG,
----------------
pengfei wrote:
> Is there a test to cover this combine?
Yes, without it we get a couple of test regressions, as the patch more aggressively generates PMADDWD custom nodes before other nodes further up the DAG have been simplified to zero.

I'll see if I can pull this out as a pre-commit. I left this in such a general state as I wondered about supporting PMADDUBSW as well with the same combine.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108522/new/

https://reviews.llvm.org/D108522


