[all-commits] [llvm/llvm-project] e28376: [X86] Add i32->float and i64->double bitcast pseud...

Mon Oct 19 12:57:57 PDT 2020

  Branch: refs/heads/master
  Home:   https://github.com/llvm/llvm-project
  Commit: e28376ec28b9034a35e01c95ccb4de9ccc6c4954
      https://github.com/llvm/llvm-project/commit/e28376ec28b9034a35e01c95ccb4de9ccc6c4954
  Author: Craig Topper <craig.topper at gmail.com>
  Date:   2020-10-19 (Mon, 19 Oct 2020)

  Changed paths:
    M llvm/lib/Target/X86/X86InstrFoldTables.cpp
    M llvm/lib/Target/X86/X86InstrInfo.cpp
    M llvm/test/CodeGen/X86/pr47874.ll

  Log Message:
  -----------
  [X86] Add i32->float and i64->double bitcast pseudo instructions to store folding table.

We have pseudo instructions we use for bitcasts between these types.
We have them in the load folding table, but not the store folding
table. This adds them there so they can be used for stack spills.

I added an exact size check so that we don't fold when the stack slot
is larger than the GPR. Otherwise the upper bits in the stack slot
would be garbage. That would be fine for Eli's test case in PR47874,
but I'm not sure it's safe in general.
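
As an illustration of the concern above, here is a minimal C++ sketch. It is not the code from this commit. The names canFoldBitcastIntoStore, storeGprIntoWideSlot and upperBytesAreStale are hypothetical, and the 0xAA fill byte is an assumed stand-in for whatever stale data a stack slot might hold. It models a 4-byte GPR store into a 16-byte slot and shows that the upper 12 bytes keep their old contents, which is why a fold would presumably only be safe when the sizes match exactly.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical predicate modelling the "exact size check" described in the
// log message: fold only when the stack slot is exactly as wide as the GPR.
static bool canFoldBitcastIntoStore(unsigned slotBytes, unsigned gprBytes) {
  return slotBytes == gprBytes;
}

// Models storing a 4-byte GPR into a 16-byte stack slot that already holds
// stale bytes (0xAA is an assumed placeholder for leftover data).
static std::array<std::uint8_t, 16> storeGprIntoWideSlot(std::uint32_t gpr) {
  std::array<std::uint8_t, 16> slot;
  slot.fill(0xAA);                              // pre-existing slot contents
  std::memcpy(slot.data(), &gpr, sizeof(gpr));  // only 4 bytes are written
  return slot;
}

// Returns true if bytes 4..15 of the slot still hold the stale fill value.
static bool upperBytesAreStale(const std::array<std::uint8_t, 16> &slot) {
  for (std::size_t i = 4; i < slot.size(); ++i)
    if (slot[i] != 0xAA)
      return false;
  return true;
}

// Small self-check tying the pieces together.
static void demoExactSizeCheck() {
  assert(canFoldBitcastIntoStore(4, 4));    // i32 -> float, 4-byte slot
  assert(!canFoldBitcastIntoStore(16, 4));  // 16-byte slot: do not fold
  assert(upperBytesAreStale(storeGprIntoWideSlot(0x3f800000u)));
}
```

A 16-byte reload from such a slot would read the stale upper bytes along with the 4 bytes that were stored, which is the "garbage" the log message refers to.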

A step towards fixing PR47874. Next steps are to change the ADDSSrr_Int
pseudo instructions to use FR32 as the second source register class
instead of VR128. That will keep the coalescer from promoting the
register class of the bitcast instruction which will make the stack
slot 4 bytes instead of 16 bytes.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D89656

  Commit: edd0cb11bd182de8d70b7bbeba73f88d7a3714db
      https://github.com/llvm/llvm-project/commit/edd0cb11bd182de8d70b7bbeba73f88d7a3714db
  Author: Craig Topper <craig.topper at gmail.com>
  Date:   2020-10-19 (Mon, 19 Oct 2020)

  Changed paths:
    M llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
    M llvm/test/CodeGen/X86/vector-popcnt-128.ll
    M llvm/test/CodeGen/X86/vector-popcnt-256.ll
    M llvm/test/CodeGen/X86/vector-popcnt-512.ll

  Log Message:
  -----------
  [SelectionDAG][X86] Enable SimplifySetCC CTPOP transforms for vector splats

This enables these transforms for vectors:
(ctpop x) u< 2 -> (x & x-1) == 0
(ctpop x) u> 1 -> (x & x-1) != 0
(ctpop x) == 1 -> (x != 0) && ((x & x-1) == 0)
(ctpop x) != 1 -> (x == 0) || ((x & x-1) != 0)
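
The four rewrites listed above can be checked on scalars with a short C++ sketch. This is not the SelectionDAG code. The function names are hypothetical, and a uint32_t stands in for a single vector lane on the assumption that the transform applies element-wise. Each predicate is compared against a reference population count.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Reference population count (number of set bits in x).
static unsigned popcount32(std::uint32_t x) {
  return static_cast<unsigned>(std::bitset<32>(x).count());
}

// x & (x - 1) clears the lowest set bit. For x == 0 the subtraction wraps
// to all-ones (unsigned arithmetic), so the result is still 0.
static std::uint32_t clearLowestSetBit(std::uint32_t x) { return x & (x - 1); }

// (ctpop x) u< 2  ->  (x & x-1) == 0
static bool ctpopULT2(std::uint32_t x) { return clearLowestSetBit(x) == 0; }
// (ctpop x) u> 1  ->  (x & x-1) != 0
static bool ctpopUGT1(std::uint32_t x) { return clearLowestSetBit(x) != 0; }
// (ctpop x) == 1  ->  (x != 0) && ((x & x-1) == 0)
static bool ctpopEQ1(std::uint32_t x) {
  return x != 0 && clearLowestSetBit(x) == 0;
}
// (ctpop x) != 1  ->  (x == 0) || ((x & x-1) != 0)
static bool ctpopNE1(std::uint32_t x) {
  return x == 0 || clearLowestSetBit(x) != 0;
}

// True if all four rewrites agree with the reference popcount for x.
static bool rewritesAgree(std::uint32_t x) {
  unsigned pop = popcount32(x);
  return ctpopULT2(x) == (pop < 2) && ctpopUGT1(x) == (pop > 1) &&
         ctpopEQ1(x) == (pop == 1) && ctpopNE1(x) == (pop != 1);
}

// Exhaustively checks every 16-bit value plus a few 32-bit edge cases.
static void checkRewrites() {
  for (std::uint32_t x = 0; x <= 0xFFFFu; ++x)
    assert(rewritesAgree(x));
  assert(rewritesAgree(0x80000000u));
  assert(rewritesAgree(0xFFFFFFFFu));
}
```

The x == 0 case is why the last two forms need the extra zero test: x & (x-1) is 0 both for zero and for exact powers of two.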

All four are enabled if CTPOP isn't Legal. This differs from the scalar
behavior where the first two are done unconditionally and the
last two are done if CTPOP isn't Legal or Custom. The Legal
check produced better results for vectors based on X86's
custom handling. Might be worth re-visiting scalars here.

I disabled looking through truncates for vectors. The
code that creates the new setcc can use the same result VT as the
original setcc even if we truncated the input. That may
work for most scalars, but definitely wouldn't work for vectors
unless it was a vector of i1.

Fixes or at least improves PR47825.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D89346

Compare: https://github.com/llvm/llvm-project/compare/ae3625d7526f...edd0cb11bd18