[llvm-bugs] [Bug 36243] New: Missed optimization: inefficient codegen for __builtin_addc

via llvm-bugs llvm-bugs at lists.llvm.org
Mon Feb 5 14:34:12 PST 2018


https://bugs.llvm.org/show_bug.cgi?id=36243

            Bug ID: 36243
           Summary: Missed optimization: inefficient codegen for
                    __builtin_addc
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Scalar Optimizations
          Assignee: unassignedbugs at nondot.org
          Reporter: koriakin at 0x04.net
                CC: llvm-bugs at lists.llvm.org

clang has __builtin_addc* functions, which are supposed to emit hardware
add-with-carry instructions.  However, there is no corresponding intrinsic on
the LLVM side, so clang emits a sequence of instructions that is only
recognized and folded to a single hardware instruction in two cases:

- carry input is 0, or
- carry output is unused

This means that any carry chain longer than 2 additions results in inefficient
code, because the middle links have both a live carry input and a live carry
output:

void add3(
    unsigned long long *restrict a,
    unsigned long long *restrict b,
    unsigned long long *restrict c
) {
    unsigned long long cf = 0;
    c[0] = __builtin_addcll(a[0], b[0], cf, &cf);
    c[1] = __builtin_addcll(a[1], b[1], cf, &cf);
    c[2] = __builtin_addcll(a[2], b[2], cf, &cf);
}

Compiles to:

add3:                                   # @add3
        .cfi_startproc
# BB#0:
        movq    (%rdi), %rax
        movq    (%rsi), %r8
        leaq    (%rax,%r8), %rcx
        movq    %rcx, (%rdx)
        movq    8(%rdi), %rcx
        addq    8(%rsi), %rcx
        setb    %r9b
        addq    %r8, %rax
        adcq    $0, %rcx
        setb    %al
        orb     %r9b, %al
        movzbl  %al, %eax
        movq    %rcx, 8(%rdx)
        movq    16(%rsi), %rcx
        addq    16(%rdi), %rcx
        addq    %rax, %rcx
        movq    %rcx, 16(%rdx)
        retq
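
The setb / setb / orb sequence above mirrors how each call is expanded before
the backend sees it.  As far as I can tell, one __builtin_addcll call behaves
like the minimal C sketch below: two overflow-checked additions whose carry
bits are ORed together.  The name addc_expanded is hypothetical and used only
for illustration; this is a model of the semantics, not the actual clang
lowering code.

```c
#include <stdint.h>

/* Hypothetical model of one __builtin_addcll(x, y, carry_in, &carry_out)
 * call: two unsigned additions, each with its own carry-out, and the two
 * carry bits combined with OR. */
uint64_t addc_expanded(uint64_t x, uint64_t y, uint64_t carry_in,
                       uint64_t *carry_out)
{
    uint64_t partial = x + y;          /* first add: x + y            */
    uint64_t c1 = partial < x;         /* carry out of the first add  */
    uint64_t sum = partial + carry_in; /* second add: + carry_in      */
    uint64_t c2 = sum < partial;       /* carry out of the second add */

    *carry_out = c1 | c2;              /* at most one of c1, c2 is set
                                          when carry_in is 0 or 1     */
    return sum;
}
```

When carry_in is a known 0, the second addition disappears; when carry_out is
unused, the OR disappears.  Those are the two cases that fold to a single
instruction today.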

I suppose we're going to need a new target-independent intrinsic,
say { iX, i1 } @llvm.uadd.with.overflow.carry.iX(iX, iX, i1) (and a
corresponding one for subtraction as well), and map it to ISD::ADDE /
ISD::ADDCARRY.
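
To make the intended semantics of such an intrinsic concrete, here is a rough
C sketch for the i64 case.  Everything in it is an assumption made for
illustration: the struct sum_carry type, the function uadd_carry_i64 (standing
in for the proposed intrinsic, which does not exist), and the add3_chain
helper.  The point is only that a single operation takes a 1-bit carry in and
produces a 1-bit carry out, so a chain needs no extra OR between links.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical C stand-in for the { i64, i1 } result of the proposed
 * intrinsic. */
struct sum_carry {
    uint64_t sum;
    bool carry;
};

/* Hypothetical model of the proposed add-with-carry operation:
 * computes x + y + carry_in and reports the carry out of bit 63. */
struct sum_carry uadd_carry_i64(uint64_t x, uint64_t y, bool carry_in)
{
    struct sum_carry r;
    r.sum = x + y + carry_in;
    /* With a carry in, the sum wrapping all the way back to x also means
     * a carry out, hence <= instead of <. */
    r.carry = carry_in ? (r.sum <= x) : (r.sum < x);
    return r;
}

/* Three-limb addition expressed as a chain of the operation above. */
void add3_chain(const uint64_t *a, const uint64_t *b, uint64_t *c)
{
    bool carry = false;
    for (int i = 0; i < 3; i++) {
        struct sum_carry r = uadd_carry_i64(a[i], b[i], carry);
        c[i] = r.sum;
        carry = r.carry;
    }
}
```

With one operation per link, each step could map to one add-with-carry node
instead of the add / add / or pattern that currently has to be
pattern-matched.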
