[all-commits] [llvm/llvm-project] cf8fad: Match (xor TSize - 1, ctlz) to `bsr` instead of `l...
goldsteinn via All-commits
all-commits at lists.llvm.org
Mon Feb 6 12:16:49 PST 2023
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: cf8fadcf9b9362bff30e31cce06b516aa1156ce1
https://github.com/llvm/llvm-project/commit/cf8fadcf9b9362bff30e31cce06b516aa1156ce1
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2023-02-06 (Mon, 06 Feb 2023)
Changed paths:
M llvm/lib/Target/X86/X86ISelLowering.cpp
M llvm/test/CodeGen/X86/clz.ll
Log Message:
-----------
Match (xor TSize - 1, ctlz) to `bsr` instead of `lzcnt` + `xor`
Previously we de-optimized when -march supported lzcnt by emitting
`lzcnt` + `xor`; there is no reason to add the extra instruction.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D141464
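To illustrate why the `xor` is redundant: for a nonzero 32-bit value, `31 ^ ctlz(x)` equals `31 - ctlz(x)`, which is the index of the highest set bit, i.e. what `bsr` produces. The sketch below is only a portable model of that identity; the helper names are hypothetical and are not taken from the patch.

```cpp
#include <cstdint>

// Portable count-leading-zeros (hypothetical helper standing in for ctlz/lzcnt).
inline unsigned clz32(std::uint32_t x) {
    unsigned n = 0;
    for (std::uint32_t bit = 0x80000000u; bit != 0 && (x & bit) == 0; bit >>= 1)
        ++n;
    return n;
}

// Index of the highest set bit; models the result of bsr for nonzero input.
inline unsigned highest_set_bit(std::uint32_t x) {
    unsigned idx = 0;
    while (x >>= 1)
        ++idx;
    return idx;
}

// The pattern from the commit title with TSize = 32: xor (TSize - 1), ctlz(x).
inline unsigned xor_of_clz(std::uint32_t x) {
    return 31u ^ clz32(x);
}
```

For any nonzero `x`, `xor_of_clz(x)` and `highest_set_bit(x)` agree. Zero input is left out of the sketch on purpose, since `bsr` leaves its destination undefined in that case.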
Commit: 3857d9decc4db2a0f14fd7cb7cd69be55f12cc4a
https://github.com/llvm/llvm-project/commit/3857d9decc4db2a0f14fd7cb7cd69be55f12cc4a
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2023-02-06 (Mon, 06 Feb 2023)
Changed paths:
M llvm/lib/Target/X86/X86ISelLowering.cpp
M llvm/test/CodeGen/X86/bmi-out-of-order.ll
Log Message:
-----------
Search through associative operators for BMI patterns (BLSI, BLSR, BLSMSK)
(a & (-b)) & b is often lowered as:
%sub = sub i32 0, %b
%and0 = and i32 %sub, %a
%and1 = and i32 %and0, %b
This isn't detected by the BLSI pattern because b and -b are never
operands of the same SDNode.
This patch does a small search through associative operators and tries
to place BMI patterns in the same node so they hit the pattern.
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D141179
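The reassociation this relies on can be modelled at the source level. The sketch below assumes 32-bit unsigned arithmetic and uses hypothetical function names; it only shows that the two shapes compute the same value, not how the DAG search is implemented.

```cpp
#include <cstdint>

// Shape from the IR above: (a & -b) & b, where -b and b feed different `and`s.
inline std::uint32_t as_lowered(std::uint32_t a, std::uint32_t b) {
    std::uint32_t sub  = 0u - b;
    std::uint32_t and0 = sub & a;
    return and0 & b;
}

// Reassociated shape: b & -b is a single subexpression (the BLSI form),
// which is then and-ed with a.
inline std::uint32_t reassociated(std::uint32_t a, std::uint32_t b) {
    std::uint32_t lowest_set = b & (0u - b);
    return a & lowest_set;
}
```

Because `and` is associative and commutative, both functions return the same result for every input; only the second keeps `b` and `-b` adjacent.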
Commit: 725b72c1fa608c886a1a5dbb75df23a05e91d5e8
https://github.com/llvm/llvm-project/commit/725b72c1fa608c886a1a5dbb75df23a05e91d5e8
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2023-02-06 (Mon, 06 Feb 2023)
Changed paths:
M llvm/lib/Target/X86/X86InstrInfo.td
M llvm/test/CodeGen/X86/GlobalISel/select-blsi.mir
M llvm/test/CodeGen/X86/GlobalISel/select-blsr.mir
M llvm/test/CodeGen/X86/bmi-out-of-order.ll
Log Message:
-----------
Only match BMI (BLSR, BLSI, BLSMSK) if the add/sub op is single use
If the add/sub is not single use, it will need to be materialized
later, in which case using the BMI instruction is a de-optimization in
terms of code-size and throughput.
i.e:
```
// Good
leal -1(%rdi), %eax
andl %eax, %eax
xorl %eax, %esi
...
```
```
// Unnecessary BMI (lower throughput, larger code size)
leal -1(%rdi), %eax
blsr %edi, %eax
xorl %eax, %esi
...
```
Note, this may sometimes cause more `mov` instructions to be emitted,
because BMI instructions take a single src and only write to dst. A
better approach may be to only avoid BMI for (and/xor X, (add/sub
0/-1, X)) if this is the last use of X but NOT the last use of
(add/sub 0/-1, X).
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D141180
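As a source-level picture of the multi-use case, the sketch below has `x - 1` feed both the BLSR-shaped `and` and an unrelated `xor`. The struct and function names are hypothetical, and which instructions a given compiler actually selects for it is not something the sketch asserts.

```cpp
#include <cstdint>

// x & (x - 1): clears the lowest set bit (the BLSR form), with x - 1 single use.
inline std::uint32_t blsr_form(std::uint32_t x) {
    return x & (x - 1u);
}

struct Result {
    std::uint32_t cleared;  // x & (x - 1)
    std::uint32_t mixed;    // y ^ (x - 1)
};

// Here x - 1 has two uses, so it has to be materialized on its own anyway.
inline Result multi_use(std::uint32_t x, std::uint32_t y) {
    std::uint32_t dec = x - 1u;  // the add/sub op that is not single use
    return Result{x & dec, y ^ dec};
}
```

In `multi_use` the decrement is needed in a register regardless, which is the situation where the commit prefers a plain `and` over the BMI instruction.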
Commit: ee5585ed09aff2e54cb540fad4c33f0c93626b1b
https://github.com/llvm/llvm-project/commit/ee5585ed09aff2e54cb540fad4c33f0c93626b1b
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2023-02-06 (Mon, 06 Feb 2023)
Changed paths:
M llvm/lib/CodeGen/BranchFolding.cpp
M llvm/test/CodeGen/X86/add.ll
M llvm/test/CodeGen/X86/atom-pad-short-functions.ll
M llvm/test/CodeGen/X86/avx512-i1test.ll
M llvm/test/CodeGen/X86/bmi.ll
M llvm/test/CodeGen/X86/brcond.ll
M llvm/test/CodeGen/X86/btq.ll
M llvm/test/CodeGen/X86/cmp-merge.ll
M llvm/test/CodeGen/X86/cmp.ll
M llvm/test/CodeGen/X86/comi-flags.ll
M llvm/test/CodeGen/X86/extern_weak.ll
M llvm/test/CodeGen/X86/fold-rmw-ops.ll
M llvm/test/CodeGen/X86/fp-strict-scalar-cmp-fp16.ll
M llvm/test/CodeGen/X86/fp-strict-scalar-cmp.ll
M llvm/test/CodeGen/X86/funnel-shift.ll
M llvm/test/CodeGen/X86/jump_sign.ll
M llvm/test/CodeGen/X86/neg_cmp.ll
M llvm/test/CodeGen/X86/or-branch.ll
M llvm/test/CodeGen/X86/peep-test-4.ll
M llvm/test/CodeGen/X86/pr37025.ll
M llvm/test/CodeGen/X86/pr37063.ll
M llvm/test/CodeGen/X86/rd-mod-wr-eflags.ll
M llvm/test/CodeGen/X86/segmented-stacks.ll
M llvm/test/CodeGen/X86/sibcall.ll
M llvm/test/CodeGen/X86/slow-incdec.ll
M llvm/test/CodeGen/X86/sqrt-partial.ll
M llvm/test/CodeGen/X86/switch-bt.ll
M llvm/test/CodeGen/X86/tail-opts.ll
M llvm/test/CodeGen/X86/tailcall-cgp-dup.ll
M llvm/test/CodeGen/X86/tailcall-extract.ll
M llvm/test/CodeGen/X86/xor-icmp.ll
Log Message:
-----------
Recommit "Improve and enable folding of conditional branches with tail calls." (2nd Try)
Improve and enable folding of conditional branches with tail calls.
1. Make it so that conditional tail calls can be emitted even when
there are multiple predecessors.
2. Don't guard the transformation behind -Os. The rationale for
guarding it was that static prediction can be affected by whether the
branch is forward or backward. This is no longer true for almost any
X86 CPU (anything newer than `SnB`), so it is no longer a meaningful
concern.
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D140931
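For orientation, the kind of source that gives rise to a conditional branch followed by tail calls looks roughly like the sketch below. The function names are hypothetical, and whether a particular compiler and optimization level folds the branch into a conditional jump to the callee is not guaranteed by this example.

```cpp
// Hypothetical callees; in real code these would typically live in another
// translation unit so the calls are not simply inlined.
int on_zero(int x) { return x + 1; }
int on_nonzero(int x) { return x * 2; }

// Both returns are calls in tail position. The first one is only reached
// through a conditional branch, which is the shape the fold targets.
int dispatch(int x) {
    if (x == 0)
        return on_zero(x);
    return on_nonzero(x);
}
```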
Commit: 19c766f7423abb1808c4de94ea0a0f09ef0a6ada
https://github.com/llvm/llvm-project/commit/19c766f7423abb1808c4de94ea0a0f09ef0a6ada
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2023-02-06 (Mon, 06 Feb 2023)
Changed paths:
M llvm/test/Transforms/InstCombine/icmp-mul.ll
Log Message:
-----------
Add tests for folding (icmp UnsignedPred X * Z, Y * Z) -> (icmp UnsignedPred X, Y); NFC
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D142785
Commit: 2a3732f934b1ef46fc8f0fdae77836c6604533cb
https://github.com/llvm/llvm-project/commit/2a3732f934b1ef46fc8f0fdae77836c6604533cb
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2023-02-06 (Mon, 06 Feb 2023)
Changed paths:
M llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
M llvm/test/Transforms/InstCombine/icmp-mul.ll
Log Message:
-----------
Add transform for `(mul X, OddC) eq/ne N * C` --> `X eq/ne N`
We previously only did this if the `mul` was `nuw`, but it works for
any odd value.
Alive2 Links:
EQ: https://alive2.llvm.org/ce/z/6_HPZ5
NE: https://alive2.llvm.org/ce/z/c34qSU
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D143026
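The reason no `nuw` is needed is that an odd constant is invertible modulo 2^n, so multiplying by it is a bijection on n-bit values. The sketch below checks this exhaustively for 8-bit wrapping arithmetic; the function name is hypothetical and the 8-bit width is chosen only to keep the loop small.

```cpp
#include <cstdint>

// Returns true if, for every pair of 8-bit values x and n,
// (x * c == n * c) under wrapping arithmetic holds exactly when (x == n).
inline bool odd_mul_eq_fold_holds(std::uint8_t c) {
    for (unsigned x = 0; x < 256; ++x) {
        for (unsigned n = 0; n < 256; ++n) {
            std::uint8_t lhs = static_cast<std::uint8_t>(x * c);
            std::uint8_t rhs = static_cast<std::uint8_t>(n * c);
            if ((lhs == rhs) != (x == n))
                return false;
        }
    }
    return true;
}
```

The check passes for odd constants such as 3 or 255 and fails for an even constant such as 2, where 0 and 128 both multiply to 0.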
Commit: abbd256a810a0b0c92dda88a3050fc85cb604a9c
https://github.com/llvm/llvm-project/commit/abbd256a810a0b0c92dda88a3050fc85cb604a9c
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2023-02-06 (Mon, 06 Feb 2023)
Changed paths:
M llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
M llvm/test/Transforms/InstCombine/icmp-mul.ll
Log Message:
-----------
Improve transforms for (icmp uPred X * Z, Y * Z) -> (icmp uPred X, Y)
Several cases were missing.
1. `(icmp eq/ne X*Z, Y*Z) [if Z % 2 != 0] -> (icmp eq/ne X, Y)`
EQ: https://alive2.llvm.org/ce/z/6_HPZ5
NE: https://alive2.llvm.org/ce/z/c34qSU
There was previously an implementation of this that worked if `Y`
was non-constant, but it was missing if `Y*Z` evaluated to a
constant and/or `nsw`/`nuw` were both false. It also only
worked if `Z` was a constant, but we can check the 1s bit of
`KnownBits` to cover more cases.
2. `(icmp eq/ne X*Z, Y*Z) [if Z != 0 and nsw(X*Z) and nsw(Y*Z)] -> (icmp eq/ne X, Y)`
EQ: https://alive2.llvm.org/ce/z/6SdAG6
NE: https://alive2.llvm.org/ce/z/fjsq_b
This was previously implemented only to work if `Z` was constant,
but we can use `isKnownNonZero` to cover more cases.
3. `(icmp uPred X*Z, Y*Z) [if Z != 0 and nuw(X*Z) and nuw(Y*Z)] -> (icmp uPred X, Y)`
EQ: https://alive2.llvm.org/ce/z/FqWQLX
NE: https://alive2.llvm.org/ce/z/2gHrd2
ULT: https://alive2.llvm.org/ce/z/MUAWgZ
ULE: https://alive2.llvm.org/ce/z/szQQ2L
UGT: https://alive2.llvm.org/ce/z/McVUdu
UGE: https://alive2.llvm.org/ce/z/95uyC8
This was previously implemented only for the `eq/ne` cases, and
only if `Z` was constant, but again we can use `isKnownNonZero` to
cover more cases.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D142786
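Cases 2 and 3 above can be sanity-checked by brute force on 8-bit values, as a small complement to the Alive2 proofs. In the sketch below the helpers `mul_nuw` and `mul_nsw` are hypothetical models of the IR flags (the product fits in the unsigned or signed 8-bit range), and only `ult` is checked as a representative unsigned predicate.

```cpp
#include <cstdint>

// Models `nuw` on an i8 mul: the unsigned product does not wrap.
inline bool mul_nuw(unsigned a, unsigned b) { return a * b <= 0xFFu; }

// Models `nsw` on an i8 mul: the signed product does not overflow.
inline bool mul_nsw(int a, int b) {
    int p = a * b;
    return p >= -128 && p <= 127;
}

// Case 3 (ult): Z != 0, nuw(X*Z), nuw(Y*Z)  =>  (X*Z < Y*Z) == (X < Y).
inline bool ult_fold_holds() {
    for (unsigned z = 1; z < 256; ++z)
        for (unsigned x = 0; x < 256; ++x)
            for (unsigned y = 0; y < 256; ++y) {
                if (!mul_nuw(x, z) || !mul_nuw(y, z))
                    continue;
                if ((x * z < y * z) != (x < y))
                    return false;
            }
    return true;
}

// Case 2 (eq): Z != 0, nsw(X*Z), nsw(Y*Z)  =>  (X*Z == Y*Z) == (X == Y).
inline bool eq_nsw_fold_holds() {
    for (int z = -128; z < 128; ++z) {
        if (z == 0)
            continue;
        for (int x = -128; x < 128; ++x)
            for (int y = -128; y < 128; ++y) {
                if (!mul_nsw(x, z) || !mul_nsw(y, z))
                    continue;
                if ((x * z == y * z) != (x == y))
                    return false;
            }
    }
    return true;
}
```

Both checks rely on the same observation: once wrapping is ruled out, the products behave like ordinary integer products, and multiplying by a nonzero `Z` preserves equality (and, for unsigned values, ordering).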
Compare: https://github.com/llvm/llvm-project/compare/3b73fc320f91...abbd256a810a