[all-commits] [llvm/llvm-project] 65a576: [X86] Add tests for incorrectly optimizing out shu...

goldsteinn via All-commits all-commits at lists.llvm.org
Mon Oct 9 12:40:57 PDT 2023


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 65a576e27be814bd23f39b31a8074e9850b0fe26
      https://github.com/llvm/llvm-project/commit/65a576e27be814bd23f39b31a8074e9850b0fe26
  Author: Noah Goldstein <goldstein.w.n at gmail.com>
  Date:   2023-10-09 (Mon, 09 Oct 2023)

  Changed paths:
    M llvm/test/CodeGen/X86/movmsk-cmp.ll

  Log Message:
  -----------
  [X86] Add tests for incorrectly optimizing out shuffle used in `movmsk`; PR67287


  Commit: 1684c65bc997a8ce0ecf96a493784fe39def75de
      https://github.com/llvm/llvm-project/commit/1684c65bc997a8ce0ecf96a493784fe39def75de
  Author: Noah Goldstein <goldstein.w.n at gmail.com>
  Date:   2023-10-09 (Mon, 09 Oct 2023)

  Changed paths:
    M llvm/lib/Target/X86/X86ISelLowering.cpp
    M llvm/test/CodeGen/X86/movmsk-cmp.ll

  Log Message:
  -----------
  [X86] Fix logic for optimizing movmsk(bitcast(shuffle(x))); PR67287

Prior logic would remove the shuffle iff all of the elements in `x`
were used. This is incorrect.

The issue is that `movmsk` only cares about the high bits, so if the
width of the elements in `x` is smaller than the width of the elements
for the `movmsk`, then the shuffle, even if it preserves all the
elements, may change which ones supply the high bits.

For example:
`movmsk64(bitcast(shuffle32(x, (1,0,3,2))))`

Even though the shuffle mask `(1,0,3,2)` preserves all the elements, it
flips which elements are relevant to the `movmsk64` (x[1] and x[3]
before and x[0] and x[2] after).
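
The example above can be checked with a small scalar model. This is
only a sketch, not code from the patch: `V4x32`, `shuffle32` and
`movmsk64` are hypothetical names, and the model assumes a 128-bit
vector in little-endian lane order, so the sign bit of 64-bit lane j is
the top bit of 32-bit lane 2*j+1.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Four 32-bit lanes modelling a 128-bit vector; lane 0 is least significant.
using V4x32 = std::array<std::uint32_t, 4>;

// shuffle32(x, mask): result lane i takes x[mask[i]].
V4x32 shuffle32(const V4x32 &x, const std::array<int, 4> &mask) {
  V4x32 r{};
  for (int i = 0; i < 4; ++i)
    r[i] = x[mask[i]];
  return r;
}

// movmsk64 of the vector viewed (bitcast) as 2 x 64-bit lanes:
// bit j of the result is the sign bit of 64-bit lane j, which is
// the top bit of 32-bit lane 2*j+1 in this model.
unsigned movmsk64(const V4x32 &v) {
  unsigned m = 0;
  for (int j = 0; j < 2; ++j)
    m |= ((v[2 * j + 1] >> 31) & 1u) << j;
  return m;
}
```

With sign bits set only in x[0] and x[2], `movmsk64(x)` is 0 in this
model, while `movmsk64(shuffle32(x, {1,0,3,2}))` is 3, so dropping the
shuffle changes the result.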

The fix here is to ensure that the shuffle mask can be scaled to the
element width of the `movmsk` instruction. This ensures that the
"high" elements stay "high". This is overly conservative, as it
misses cases like `(1,1,3,3)` where the "high" elements stay
intact despite the mask not being scalable, but for a relatively
edge-case optimization that should generally be handled during
simplifyDemandedBits, it seems okay.
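
What "can be scaled" means for a mask can be sketched as follows. This
is an illustration under assumptions, not the code in
X86ISelLowering.cpp: `canScaleMask` is a hypothetical helper, `scale`
is the ratio of the `movmsk` element width to the shuffle element
width, and undef mask entries are simply rejected.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns true if `mask`, expressed over narrow
// elements, is equivalent to a shuffle over elements `scale` times wider.
// Each group of `scale` entries must select one whole, aligned wide element.
bool canScaleMask(const std::vector<int> &mask, int scale) {
  if (scale <= 0 || mask.size() % static_cast<std::size_t>(scale) != 0)
    return false;
  for (std::size_t i = 0; i < mask.size(); i += scale) {
    // The group must start on a wide-element boundary (undef/-1 rejected).
    if (mask[i] < 0 || mask[i] % scale != 0)
      return false;
    // The remaining entries must be the consecutive narrow elements.
    for (int k = 1; k < scale; ++k)
      if (mask[i + k] != mask[i] + k)
        return false;
  }
  return true;
}
```

Under this sketch, with `scale` = 2, `(2,3,0,1)` is scalable (it swaps
the two 64-bit elements), while `(1,0,3,2)` and `(1,1,3,3)` are not.
The latter is the conservative miss described above.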


Compare: https://github.com/llvm/llvm-project/compare/cbafb6f2f5c9...1684c65bc997

