[LLVMbugs] [Bug 11730] New: [AVX, SSE] inefficient code generated for vector compares due to sext to i32 moved across phi node

bugzilla-daemon at llvm.org bugzilla-daemon at llvm.org
Mon Jan 9 10:25:25 PST 2012


http://llvm.org/bugs/show_bug.cgi?id=11730

             Bug #: 11730
           Summary: [AVX,SSE] inefficient code generated for vector
                    compares due to sext to i32 moved across phi node
           Product: new-bugs
           Version: trunk
          Platform: PC
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: new bugs
        AssignedTo: unassignedbugs at nondot.org
        ReportedBy: matt at pharr.org
                CC: llvmbugs at cs.uiuc.edu
    Classification: Unclassified


Created attachment 7856
  --> http://llvm.org/bugs/attachment.cgi?id=7856
sext instructions immediately after vector compares, good code

The attached two test cases demonstrate an interesting situation that leads to
much worse code than usual for vector computation on SSE and AVX (at least).

As context, when doing vector comparisons on those targets, it's usually
important to immediately sext the <n x i1> result of the comparison to an <n x
i32> value, since that is what the hardware compare instructions actually
return.  The x86 code generator picks up on this pattern and then emits just
the desired vector comparison instruction.
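
To make the pattern concrete, here is a small lane-level model in Python (not
LLVM IR; the helper names are made up for illustration).  It only sketches the
semantics: sign-extending each i1 lane of a compare gives 0 or -1, which is the
same bit pattern as the all-zeros / all-ones lane mask that a packed compare
produces.

```python
MASK32 = 0xFFFFFFFF

def icmp_slt(a, b):
    """Lane-wise compare: one boolean per lane, like an <n x i1> result."""
    return [x < y for x, y in zip(a, b)]

def sext_i1_to_i32(bits):
    """Sign-extend each i1 lane to i32: True -> -1, False -> 0."""
    return [-1 if bit else 0 for bit in bits]

def packed_compare_mask(a, b):
    """Model of a packed compare: each 32-bit lane is all-ones or all-zeros."""
    return [MASK32 if x < y else 0 for x, y in zip(a, b)]

def as_u32(lanes):
    """View signed lane values as raw 32-bit patterns."""
    return [v & MASK32 for v in lanes]

a = [1, 5, 3, 7]
b = [2, 4, 3, 9]

ir_form = as_u32(sext_i1_to_i32(icmp_slt(a, b)))  # compare, then sext
hw_form = packed_compare_mask(a, b)               # what the instruction yields
print([hex(v) for v in ir_form])
print(ir_form == hw_form)
```

Because the two forms agree bit for bit, a compare immediately followed by a
sext can be selected as a single instruction.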

In the attached test case, the originally-generated code had the sexts right
after the vector compares, but then a later optimization pass noticed that two
values feeding into a phi node both had sext to <n x i32> after them, so it
removed the two original sexts and added a new one after the phi node.  As a
result, the x86 code generator doesn't pick up the pattern and generates very
inefficient code that first painfully does a zext conversion to <n x i32> and
then later painfully converts this back to the originally desired sext value.
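
The rewrite described above is legal because sext commutes with the phi: the
value is the same whichever side of the phi the extension sits on.  The Python
sketch below models that equivalence on lanes of booleans; the phi() helper and
the predecessor names are hypothetical and stand in for the real control flow
in the attachments.

```python
def sext_i1_to_i32(bits):
    """Sign-extend each i1 lane to i32: True -> -1, False -> 0."""
    return [-1 if bit else 0 for bit in bits]

def phi(incoming, pred):
    """Pick the incoming value for the predecessor block we arrived from."""
    return incoming[pred]

# Two compare results reaching the merge point from different blocks.
cmp_then = [True, False, True, False]
cmp_else = [False, False, True, True]

def sext_before_phi(pred):
    # Original form: each compare is extended right where it is produced.
    return phi({"then": sext_i1_to_i32(cmp_then),
                "else": sext_i1_to_i32(cmp_else)}, pred)

def sext_after_phi(pred):
    # Transformed form: the phi carries <n x i1>, one sext follows it.
    return sext_i1_to_i32(phi({"then": cmp_then, "else": cmp_else}, pred))

for pred in ("then", "else"):
    print(pred, sext_before_phi(pred) == sext_after_phi(pred))
```

The two forms compute the same values; the difference is only that in the
second form the <n x i1> vector has to survive across the phi, so the sext is
no longer adjacent to the compare it belongs to.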

The two attachments are the same except that one has the sexts placed back
right after the vector compares, while the other has the single sext after the
phi node.  When run through llc -mattr=+avx, the second one has big sequences
of the following as a result:

    movzbl    %r15b, %edx
    vpinsrw    $0, %edx, %xmm0, %xmm0
    movzbl    %r12b, %edx
    vpinsrw    $1, %edx, %xmm0, %xmm0
    movzbl    %bl, %edx
    vpinsrw    $2, %edx, %xmm0, %xmm0
    movzbl    %r9b, %edx
    vpinsrw    $3, %edx, %xmm0, %xmm0
    movzbl    %r10b, %edx
    vpinsrw    $4, %edx, %xmm0, %xmm0
    movzbl    %r14b, %edx
    vpinsrw    $5, %edx, %xmm0, %xmm0
    movzbl    %r11b, %edx
    vpinsrw    $6, %edx, %xmm0, %xmm0
    movzbl    %sil, %edx
    vpinsrw    $7, %edx, %xmm0, %xmm6
    vpextrw    $1, %xmm6, %edx
    vmovd    %xmm6, %esi
    vmovd    %esi, %xmm0
    vpinsrd    $1, %edx, %xmm0, %xmm0
    vpextrw    $2, %xmm6, %edx
    vpinsrd    $2, %edx, %xmm0, %xmm0
    vpextrw    $3, %xmm6, %edx
    vpinsrd    $3, %edx, %xmm0, %xmm0
    vpslld    $31, %xmm0, %xmm0
    vpsrad    $31, %xmm0, %xmm7
    vpextrw    $5, %xmm6, %edx
    vpextrw    $4, %xmm6, %esi
    vmovd    %esi, %xmm0
    vpinsrd    $1, %edx, %xmm0, %xmm0
    vpextrw    $6, %xmm6, %edx
    vpinsrd    $2, %edx, %xmm0, %xmm0
    vpextrw    $7, %xmm6, %edx
    vpinsrd    $3, %edx, %xmm0, %xmm0
    vpslld    $31, %xmm0, %xmm0
    vpsrad    $31, %xmm0, %xmm0
    vinsertf128    $1, %xmm0, %ymm7, %ymm0
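
For readers unfamiliar with the vpslld $31 / vpsrad $31 pairs in the listing:
shifting a 32-bit lane left by 31 and then arithmetically right by 31 smears
bit 0 across the whole lane, which rebuilds the 0 / -1 mask from a lane that
holds the compare result only in its low bit.  The Python sketch below models
that on single lanes; the helper names are made up for illustration.

```python
MASK32 = 0xFFFFFFFF

def shl32(x, n):
    """Logical left shift of a 32-bit lane (models vpslld)."""
    return (x << n) & MASK32

def sar32(x, n):
    """Arithmetic right shift of a 32-bit lane (models vpsrad)."""
    x &= MASK32
    if x & 0x80000000:      # reinterpret as a negative signed value
        x -= 1 << 32
    return (x >> n) & MASK32

def sext_from_bit0(lane):
    """Shift pair from the listing: replicate bit 0 into all 32 bits."""
    return sar32(shl32(lane, 31), 31)

lanes = [0, 1, 1, 0]        # zero-extended compare results, one per lane
print([hex(sext_from_bit0(v)) for v in lanes])
```

Only bit 0 of each lane matters here, so any higher bits left over from the
byte and word inserts are discarded by the left shift.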



