[PATCH] D116804: [x86] use SETCC_CARRY instead of SBB node for select lowering
Sotiris Apostolakis via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Feb 1 17:03:37 PST 2022
apostolakis added a comment.
In D116804#3287531 <https://reviews.llvm.org/D116804#3287531>, @spatel wrote:
> In D116804#3286618 <https://reviews.llvm.org/D116804#3286618>, @apostolakis wrote:
>
>> We noticed a 15% performance regression for llvm_test_suite/MultiSource/Benchmarks/MallocBench/gs with this patch.
>> Looking at the assembly, we noticed a relatively long dependence chain that includes the following sequence:
>>
>> mov (%rbx),%rax
>> sbb %rax,%rax
>> or %rdx,%rax
>
> Do you have the IR for the function where that appears?
Here is a source code example, with the IR and assembly generated by clang trunk (https://godbolt.org/z/v66TM8W7e), that resembles the affected code and reproduces the instruction sequence above.
Note that on Intel targets the dependence chain is not broken. It runs through the following instructions: callq foo1(long*, long*, y_s*) -> movq 8(%rbx), %rax -> sbbq %rax, %rax -> orq %rdx, %rax -> callq foo2(int*, int, int, int, int, int, y_s*, long, long)
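To illustrate why the load feeding `sbbq %rax, %rax` matters only for scheduling, here is a minimal C++ sketch of the value that instruction computes. The helper name `sbb_same_reg` is made up for illustration and is not an LLVM API. The result depends only on the carry flag, so the previous contents of %rax never affect the value; the dependence on the load exists only because the target does not treat this idiom as dependency-breaking.

```cpp
#include <cstdint>

// Hypothetical helper (not an LLVM API): models the value produced by
// `sbb %reg, %reg`, i.e. reg - reg - CF, in 64-bit wrapping arithmetic.
constexpr std::uint64_t sbb_same_reg(std::uint64_t reg, bool carry) {
    return reg - reg - static_cast<std::uint64_t>(carry);
}

// The old register value never changes the result: 0 when CF is clear,
// all ones when CF is set.
static_assert(sbb_same_reg(0, false) == 0, "CF clear gives 0");
static_assert(sbb_same_reg(0x1234, false) == 0, "CF clear gives 0");
static_assert(sbb_same_reg(0, true) == UINT64_MAX, "CF set gives all ones");
static_assert(sbb_same_reg(0x1234, true) == UINT64_MAX, "CF set gives all ones");
```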
> If we're going to need a tuning flag similar to `TuningPOPCNTFalseDep`, then we could just use that as a predicate hack for the code that was changed in this patch (so don't take chances and always create a real SBB with zero operand if the flag is set).
This sounds okay to me.
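As a rough sketch of how I read that suggestion, the choice could be modeled as below. This is not LLVM code: `CarryMaskNode`, `TuningFlags`, `sbbFalseDep`, and `pickCarryMaskNode` are all hypothetical names, and the real predicate would live in the x86 lowering code that this patch changed.

```cpp
// Hypothetical model, not LLVM code: all names are illustrative only.
enum class CarryMaskNode { SetccCarry, SbbWithZeroOperand };

struct TuningFlags {
    // Stand-in for a tuning flag similar to TuningPOPCNTFalseDep.
    bool sbbFalseDep;
};

// If the target has the false dependence, always emit a real SBB whose
// operand is zero; otherwise keep the SETCC_CARRY form.
constexpr CarryMaskNode pickCarryMaskNode(TuningFlags tuning) {
    return tuning.sbbFalseDep ? CarryMaskNode::SbbWithZeroOperand
                              : CarryMaskNode::SetccCarry;
}

static_assert(pickCarryMaskNode(TuningFlags{true}) ==
                  CarryMaskNode::SbbWithZeroOperand,
              "flag set: real SBB with zero operand");
static_assert(pickCarryMaskNode(TuningFlags{false}) ==
                  CarryMaskNode::SetccCarry,
              "flag clear: SETCC_CARRY");
```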
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D116804/new/
https://reviews.llvm.org/D116804