[all-commits] [llvm/llvm-project] e28376: [X86] Add i32->float and i64->double bitcast pseud...
topperc via All-commits
all-commits at lists.llvm.org
Mon Oct 19 12:57:57 PDT 2020
Branch: refs/heads/master
Home: https://github.com/llvm/llvm-project
Commit: e28376ec28b9034a35e01c95ccb4de9ccc6c4954
https://github.com/llvm/llvm-project/commit/e28376ec28b9034a35e01c95ccb4de9ccc6c4954
Author: Craig Topper <craig.topper at gmail.com>
Date: 2020-10-19 (Mon, 19 Oct 2020)
Changed paths:
M llvm/lib/Target/X86/X86InstrFoldTables.cpp
M llvm/lib/Target/X86/X86InstrInfo.cpp
M llvm/test/CodeGen/X86/pr47874.ll
Log Message:
-----------
[X86] Add i32->float and i64->double bitcast pseudo instructions to store folding table.
We have pseudo instructions we use for bitcasts between these types.
We have them in the load folding table, but not the store folding
table. This adds them there so they can be used for stack spills.
I added an exact size check so that we don't fold when the stack slot
is larger than the GPR. Otherwise the upper bits in the stack slot
would be garbage. That would be fine for Eli's test case in PR47874,
but I'm not sure it's safe in general.
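To illustrate the concern behind the exact size check, here is a minimal sketch. It is not the code from this commit: Slot16, storeGpr32, upperBytesUntouched and foldIsSafe are hypothetical names, and the 16-byte array only models a stack slot. It shows that a 4-byte GPR store writes the low 4 bytes of a wider slot and leaves the rest holding whatever was there before.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical model of a 16-byte (VR128-sized) stack slot.
using Slot16 = std::array<unsigned char, 16>;

// Model of a 32-bit GPR store into the slot: only the low 4 bytes are written.
Slot16 storeGpr32(Slot16 slot, uint32_t gpr) {
  std::memcpy(slot.data(), &gpr, sizeof(gpr));
  return slot;
}

// True if bytes 4..15 still hold their previous contents, i.e. they are
// stale from the point of view of a later full-width reload.
bool upperBytesUntouched(const Slot16 &before, const Slot16 &after) {
  return std::memcmp(before.data() + 4, after.data() + 4, 12) == 0;
}

// Hypothetical stand-in for the exact size check described above:
// fold the store only when the slot is exactly as wide as the GPR.
bool foldIsSafe(unsigned slotSizeBytes, unsigned gprSizeBytes) {
  assert(slotSizeBytes >= gprSizeBytes && "slot must hold the register");
  return slotSizeBytes == gprSizeBytes;
}
```

In this model, a 16-byte reload after storeGpr32 would see 12 stale bytes, which is why folding is rejected for a 16-byte slot and accepted for a 4-byte one.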
A step towards fixing PR47874. Next steps are to change the ADDSSrr_Int
pseudo instructions to use FR32 as the second source register class
instead of VR128. That will keep the coalescer from promoting the
register class of the bitcast instruction which will make the stack
slot 4 bytes instead of 16 bytes.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D89656
Commit: edd0cb11bd182de8d70b7bbeba73f88d7a3714db
https://github.com/llvm/llvm-project/commit/edd0cb11bd182de8d70b7bbeba73f88d7a3714db
Author: Craig Topper <craig.topper at gmail.com>
Date: 2020-10-19 (Mon, 19 Oct 2020)
Changed paths:
M llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
M llvm/test/CodeGen/X86/vector-popcnt-128.ll
M llvm/test/CodeGen/X86/vector-popcnt-256.ll
M llvm/test/CodeGen/X86/vector-popcnt-512.ll
Log Message:
-----------
[SelectionDAG][X86] Enable SimplifySetCC CTPOP transforms for vector splats
This enables these transforms for vectors:
(ctpop x) u< 2 -> (x & x-1) == 0
(ctpop x) u> 1 -> (x & x-1) != 0
(ctpop x) == 1 -> (x != 0) && ((x & x-1) == 0)
(ctpop x) != 1 -> (x == 0) || ((x & x-1) != 0)
All four are enabled if CTPOP isn't Legal. This differs from the scalar
behavior where the first two are done unconditionally and the
last two are done if CTPOP isn't Legal or Custom. The Legal
check produced better results for vectors based on X86's
custom handling. Might be worth re-visiting scalars here.
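The four identities listed above can be checked exhaustively on a scalar. The following is a minimal sketch, not code from TargetLowering.cpp: popcount16, clearLowestBit and ctpopIdentitiesHold are hypothetical helper names, and a 16-bit element width is assumed only to keep the loop small.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Reference population count for a 16-bit value.
static unsigned popcount16(uint16_t x) {
  return static_cast<unsigned>(std::bitset<16>(x).count());
}

// x & (x - 1) clears the lowest set bit. For x == 0 the subtraction
// wraps to 0xFFFF and the result is still 0.
static uint16_t clearLowestBit(uint16_t x) {
  return static_cast<uint16_t>(x & static_cast<uint16_t>(x - 1));
}

// Returns true if all four rewrites agree with the ctpop-based
// comparison for every 16-bit input.
bool ctpopIdentitiesHold() {
  for (uint32_t i = 0; i <= 0xFFFF; ++i) {
    const uint16_t x = static_cast<uint16_t>(i);
    const unsigned pop = popcount16(x);
    const uint16_t y = clearLowestBit(x);
    assert(pop <= 16 && "popcount of a 16-bit value");
    if ((pop < 2) != (y == 0)) return false;              // ctpop u< 2
    if ((pop > 1) != (y != 0)) return false;              // ctpop u> 1
    if ((pop == 1) != (x != 0 && y == 0)) return false;   // ctpop == 1
    if ((pop != 1) != (x == 0 || y != 0)) return false;   // ctpop != 1
  }
  return true;
}
```

For a vector splat the same reasoning applies per element, since each lane is compared against the same constant.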
I disabled the looking through truncate for vectors. The
code that creates new setcc can use the same result VT as the
original setcc even if we truncated the input. That may
work for most scalars, but definitely wouldn't work for vectors
unless it was a vector of i1.
Fixes, or at least improves, PR47825.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D89346
Compare: https://github.com/llvm/llvm-project/compare/ae3625d7526f...edd0cb11bd18