[PATCH] D116804: [x86] use SETCC_CARRY instead of SBB node for select lowering

Sotiris Apostolakis via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 1 17:03:37 PST 2022


apostolakis added a comment.

In D116804#3287531 <https://reviews.llvm.org/D116804#3287531>, @spatel wrote:

> In D116804#3286618 <https://reviews.llvm.org/D116804#3286618>, @apostolakis wrote:
>
>> We noticed a 15% performance regression for llvm_test_suite/MultiSource/Benchmarks/MallocBench/gs with this patch.
>> Looking at the assembly, we noticed a relatively long dependence chain that includes the following sequence:
>>
>>   mov   (%rbx),%rax 
>>   sbb   %rax,%rax
>>   or    %rdx,%rax 
>
> Do you have the IR for the function where that appears?

Here is a source code example, with the IR and assembly generated by clang trunk (https://godbolt.org/z/v66TM8W7e), that resembles the affected code and reproduces the sequence of instructions above.
Note the dependence chain, which is not broken on Intel targets, through the following instructions: callq foo1(long*, long*, y_s*) -> movq 8(%rbx), %rax -> sbbq %rax, %rax -> orq %rdx, %rax -> callq foo2(int*, int, int, int, int, int, y_s*, long, long)
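
For readers who do not want to open the link, here is a minimal sketch of the kind of source pattern involved. It is not the code from the Godbolt example or from the gs benchmark; `mask_or` and its parameters are hypothetical names chosen for illustration. The select produces either all-ones or zero from an unsigned compare, which is the shape that can be materialized from the carry flag (for example as `sbb` with the same register as source and destination) and then combined with another value by an `or`. Whether a given compiler version actually emits that sequence for this function is an assumption and should be checked in the generated assembly.

```cpp
#include <cstdint>

// Hypothetical helper for illustration only.
// An unsigned compare sets the carry flag on x86, so the backend may
// materialize the all-ones/zero mask from the carry and then OR in `c`.
std::uint64_t mask_or(std::uint64_t a, std::uint64_t b, std::uint64_t c) {
    // Select between all-ones and zero based on the unsigned comparison.
    std::uint64_t mask = (a < b) ? ~std::uint64_t{0} : std::uint64_t{0};
    // Combine the mask with the third operand.
    return mask | c;
}
```

When the mask is produced with a same-register `sbb`, the result does not logically depend on the old register value, but the instruction still reads that register, so an earlier load into it (such as the `movq` above) can end up on the critical path.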

> If we're going to need a tuning flag similar to `TuningPOPCNTFalseDep`, then we could just use that as a predicate hack for the code that was changed in this patch (so don't take chances and always create a real SBB with zero operand if the flag is set).

This sounds okay to me.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116804/new/

https://reviews.llvm.org/D116804
