[all-commits] [llvm/llvm-project] d66d52: [X86][SSE] combineMulToPMADDWD - improve recogniti...

Simon Pilgrim via All-commits all-commits at lists.llvm.org
Thu Sep 2 09:36:49 PDT 2021


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: d66d520fe11c4298169e64515c853d805a3f7ab5
      https://github.com/llvm/llvm-project/commit/d66d520fe11c4298169e64515c853d805a3f7ab5
  Author: Simon Pilgrim <llvm-dev at redking.me.uk>
  Date:   2021-09-02 (Thu, 02 Sep 2021)

  Changed paths:
    M llvm/lib/Target/X86/X86ISelLowering.cpp
    M llvm/test/CodeGen/X86/madd.ll
    M llvm/test/CodeGen/X86/pmaddubsw.ll
    M llvm/test/CodeGen/X86/shrink_vmul.ll

  Log Message:
  -----------
  [X86][SSE] combineMulToPMADDWD - improve recognition of sign/zero extended upper bits

PMADDWD(v8i16 x, v8i16 y) == (v4i32) { (int)x[0]*y[0] + (int)x[1]*y[1], ..., (int)x[6]*y[6] + (int)x[7]*y[7] }
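
As an illustration of the definition above, here is a minimal Python sketch of the PMADDWD semantics. It is a scalar model for reasoning only, not LLVM code; the function name `pmaddwd` and the list-of-ints representation are assumptions made for this example.

```python
def pmaddwd(x, y):
    """Scalar model of PMADDWD on two v8i16 inputs, returning a v4i32.

    x and y are lists of 8 Python ints, each assumed to already be a
    signed 16-bit value in [-32768, 32767].
    """
    assert len(x) == len(y) == 8
    out = []
    for i in range(0, 8, 2):
        # Multiply adjacent i16 pairs as full-width ints and add them.
        s = x[i] * y[i] + x[i + 1] * y[i + 1]
        # Wrap the sum to a signed 32-bit value; the only input that
        # actually wraps is all four operands equal to -32768.
        s &= 0xFFFFFFFF
        out.append(s - (1 << 32) if s & 0x80000000 else s)
    return out
```

Each output lane depends on exactly two adjacent i16 elements of each input, which is why a vXi32 multiply can be expressed with it when the second product in every pair is zero.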

Currently combineMulToPMADDWD only folds cases where the upper 17 bits of both vXi32 inputs are known zero (i.e. the first half is non-negative and the second half of the pair is zero in each 2xi16 pair). This can be relaxed to require only one zero-extended input, provided the other input has at least 17 sign bits.

That way the sign of the result is still preserved, and the second half of the zero-extended input is still zero, so the second product in each pair contributes nothing to the sum.
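
The reasoning above can be checked with a small scalar model. This is a sketch under stated assumptions, not the code from the patch: the helper names (`to_i16`, `to_i32`, `pmaddwd_lane`, `mul_i32`, `fold_is_exact`) are hypothetical, and each i32 lane is modelled as a Python int that is split into two i16 halves.

```python
def to_i16(v):
    """Interpret the low 16 bits of v as a signed 16-bit integer."""
    v &= 0xFFFF
    return v - 0x10000 if v & 0x8000 else v


def to_i32(v):
    """Interpret the low 32 bits of v as a signed 32-bit integer."""
    v &= 0xFFFFFFFF
    return v - 0x100000000 if v & 0x80000000 else v


def pmaddwd_lane(x, y):
    """One i32 result lane of PMADDWD, with each i32 input viewed as 2 x i16."""
    lo = to_i16(x) * to_i16(y)              # first halves of the pair
    hi = to_i16(x >> 16) * to_i16(y >> 16)  # second halves of the pair
    return to_i32(lo + hi)


def mul_i32(x, y):
    """Plain wrapping 32-bit multiply, i.e. the operation being replaced."""
    return to_i32(to_i32(x) * to_i32(y))


def fold_is_exact(zext_input, sext_input):
    """True if PMADDWD gives the same lane value as the i32 multiply."""
    return pmaddwd_lane(zext_input, sext_input) == mul_i32(zext_input, sext_input)
```

With one input in [0, 32767] (upper 17 bits zero) and the other in [-32768, 32767] (at least 17 sign bits), the second half of the first input is zero, so only the first product survives and it equals the full multiply. The model also shows why one zero-extended input is still required: for two inputs of -1 the second halves are both -1, their product adds an extra 1, and `fold_is_exact(-1, -1)` is false. Likewise an input of 0x8000 has only 16 zero upper bits, so its first half reads as a negative i16 and the fold is not exact.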

Noticed while investigating PR47437.

Differential Revision: https://reviews.llvm.org/D108522



