[all-commits] [llvm/llvm-project] d66d52: [X86][SSE] combineMulToPMADDWD - improve recogniti...
Simon Pilgrim via All-commits
all-commits at lists.llvm.org
Thu Sep 2 09:36:49 PDT 2021
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: d66d520fe11c4298169e64515c853d805a3f7ab5
https://github.com/llvm/llvm-project/commit/d66d520fe11c4298169e64515c853d805a3f7ab5
Author: Simon Pilgrim <llvm-dev at redking.me.uk>
Date: 2021-09-02 (Thu, 02 Sep 2021)
Changed paths:
M llvm/lib/Target/X86/X86ISelLowering.cpp
M llvm/test/CodeGen/X86/madd.ll
M llvm/test/CodeGen/X86/pmaddubsw.ll
M llvm/test/CodeGen/X86/shrink_vmul.ll
Log Message:
-----------
[X86][SSE] combineMulToPMADDWD - improve recognition of sign/zero extended upper bits
PMADDWD(v8i16 x, v8i16 y) == (v4i32) { (int)x[0]*y[0] + (int)x[1]*y[1], ..., (int)x[6]*y[6] + (int)x[7]*y[7] }
Currently combineMulToPMADDWD only folds cases where the upper 17 bits of both vXi32 inputs are known zero (i.e. in each 2xi16 pair the first half is non-negative and the second half is zero). This can be relaxed to only require one zero-extended input if the other input has at least 17 sign bits.
That way the sign of the result is still preserved, and the second half is still zero.
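The reasoning above can be checked with a small scalar model. This is only an illustrative sketch, not code from the patch: pmaddwdLane, upper17BitsZero, hasAtLeast17SignBits and foldHoldsOnSamples are hypothetical names, and the model covers a single 32-bit lane rather than a full vector.

```cpp
#include <cstdint>

// Hypothetical scalar model of one i32 lane of PMADDWD: each i32 input is
// viewed as a pair of i16 values, and the lane result is lo*lo + hi*hi.
int32_t pmaddwdLane(int32_t a, int32_t b) {
    uint32_t ua = static_cast<uint32_t>(a);
    uint32_t ub = static_cast<uint32_t>(b);
    int16_t aLo = static_cast<int16_t>(static_cast<uint16_t>(ua & 0xFFFFu));
    int16_t aHi = static_cast<int16_t>(static_cast<uint16_t>(ua >> 16));
    int16_t bLo = static_cast<int16_t>(static_cast<uint16_t>(ub & 0xFFFFu));
    int16_t bHi = static_cast<int16_t>(static_cast<uint16_t>(ub >> 16));
    // Sum in 64 bits, then truncate, to avoid signed overflow in the model.
    int64_t sum = static_cast<int64_t>(aLo) * bLo + static_cast<int64_t>(aHi) * bHi;
    return static_cast<int32_t>(static_cast<uint32_t>(sum));
}

// Upper 17 bits known zero: the value is in [0, 32767].
bool upper17BitsZero(int32_t x) { return (static_cast<uint32_t>(x) >> 15) == 0; }

// At least 17 sign bits: the value is a sign-extended i16.
bool hasAtLeast17SignBits(int32_t y) { return y >= -32768 && y <= 32767; }

// Samples the relaxed precondition (one zero-extended input, one input with
// at least 17 sign bits) and compares the lane model with a plain i32 multiply.
bool foldHoldsOnSamples() {
    for (int32_t x = 0; x <= 32767; x += 151) {          // ends exactly at 32767
        for (int32_t y = -32768; y <= 32767; y += 257) { // ends exactly at 32767
            if (!upper17BitsZero(x) || !hasAtLeast17SignBits(y))
                return false;
            if (pmaddwdLane(x, y) != x * y)
                return false;
        }
    }
    return true;
}
```

In this model the zero-extended input always has a zero upper i16, so the second product in each pair vanishes whatever the upper half of the other input is. If both inputs are merely sign-extended, the upper halves can both be -1 and the results differ: pmaddwdLane(-1, -1) gives 2 rather than 1.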
Noticed while investigating PR47437.
Differential Revision: https://reviews.llvm.org/D108522