[llvm-bugs] [Bug 44783] New: Wrong code generated for shufflevector

Tue Feb 4 14:26:24 PST 2020

https://bugs.llvm.org/show_bug.cgi?id=44783

            Bug ID: 44783
           Summary: Wrong code generated for shufflevector
           Product: libraries
           Version: trunk
          Hardware: Other
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: Backend: AArch64
          Assignee: unassignedbugs at nondot.org
          Reporter: scw at google.com
                CC: arnaud.degrandmaison at arm.com,
                    llvm-bugs at lists.llvm.org, peter.smith at linaro.org,
                    Ties.Stuij at arm.com

Created attachment 23090
  --> https://bugs.llvm.org/attachment.cgi?id=23090&action=edit
IR reproducer

C++ code snippet:

typedef long long raw128 __attribute__((vector_size(16)));                      
typedef long long raw256 __attribute__((vector_size(32)));                      
void f() {                                                                      
  raw128 a = {784, 785};                                                        
  raw256 c = __builtin_shufflevector(a, a, 0, 1, 0, 1);                         
  (void) c;                                                                     
}

IR attached (t.ll).

Command: llc -O0 --target=aarch64 t.ll

Result:

        .text                                                                   
        .file   "t.cc"                                                          
        .globl  _Z1fv                   // -- Begin function _Z1fv              
        .p2align        2                                                       
        .type   _Z1fv,@function
_Z1fv:                                  // @_Z1fv                               
.L_Z1fv$local:                                                                  
// %bb.0:                               // %entry                               
        sub     sp, sp, #48             // =48                                  
        mov     x8, #784                                                        
        mov     x9, #785                                                        
                                        // implicit-def: $q0                    
        fmov    d0, x8                                                          
        mov     v0.d[1], x9                                                     
        str     q0, [sp, #32]                                                   
        ldr     q0, [sp, #32]                                                   
        mov     v1.16b, v0.16b                                                  
        mov     d2, v0.d[1]                                                     
        mov     v3.16b, v0.16b                                                  
        mov     d4, v0.d[1]                                                     
        str     d1, [sp]                                                        
        str     d1, [sp, #8]                                                    
        str     d3, [sp, #16]                                                   
        str     d3, [sp, #24]                                                   
        add     sp, sp, #48             // =48                                  
        ret                                                                     
.Lfunc_end0:                                                                    
        .size   _Z1fv, .Lfunc_end0-_Z1fv                                        
                                        // -- End function 

Expected result:

The four "str" instructions toward the end of the function should store d1,
d2, d3, and d4 to the stack, not d1 twice and d3 twice.
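
To make the expected values concrete, here is a minimal sketch in standard
C++ of what the shuffle in f() should produce. It models the two-input
shuffle semantics only and is not the Clang builtin itself; shuffle() and
expected_c() are hypothetical helper names introduced for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Model of a two-input shuffle: each index selects an element from the
// concatenation of `a` and `b` (0..N-1 pick from a, N..2N-1 pick from b).
template <std::size_t N, std::size_t M>
std::array<long long, M> shuffle(const std::array<long long, N>& a,
                                 const std::array<long long, N>& b,
                                 const std::array<std::size_t, M>& idx) {
  std::array<long long, M> out{};
  for (std::size_t i = 0; i < M; ++i) {
    assert(idx[i] < 2 * N && "shuffle index out of range");
    out[i] = idx[i] < N ? a[idx[i]] : b[idx[i] - N];
  }
  return out;
}

// Expected contents of `c` in the reproducer: a = {784, 785},
// shuffled with indices 0, 1, 0, 1.
std::array<long long, 4> expected_c() {
  const std::array<long long, 2> a = {784, 785};
  return shuffle<2, 4>(a, a, {0, 1, 0, 1});
}
```

Under this model c should hold {784, 785, 784, 785}. The generated code above
stores the low lane (784) into all four slots, since d1 and d3 are both the
low half of copies of v0.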

I tracked it down to AArch64InstructionSelector::preISelLower() on the G_STORE
in the following snippet:

  %11:fpr(s64), %12:fpr(s64) = G_UNMERGE_VALUES %5:fpr(<2 x s64>)
  %22:gpr(s64) = COPY %12:fpr(s64)
  G_STORE %22:gpr(s64), %17:gpr(p0) :: (store 8 into %ir.c + 24, align 16)

AArch64InstructionSelector::contractCrossBankCopyIntoStore() calls
llvm::getDefIgnoringCopies() to find the instruction that defines %22 and
gets G_UNMERGE_VALUES. It then replaces %22 with operand 0 of
G_UNMERGE_VALUES, i.e. %11. However, G_UNMERGE_VALUES has two outputs, and %22
actually comes from %12, which is operand 1.

AArch64InstructionSelector::contractCrossBankCopyIntoStore() seems to assume
that each MIR instruction has at most one output, but that is not the case
here.
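
The problem can be illustrated with a small self-contained toy model. This is
a sketch only: ToyInstr, ToyFunction, and the helper functions are
hypothetical stand-ins, not LLVM's MachineInstr or GlobalISel APIs, and the
"fixed" variant shows one possible way to avoid the issue, not necessarily
how it should be resolved in the backend.

```cpp
#include <cassert>
#include <vector>

enum class Op { Copy, Unmerge };

// Toy stand-in for a machine instruction (hypothetical, not MachineInstr).
struct ToyInstr {
  Op op;
  std::vector<unsigned> defs;  // virtual registers written
  std::vector<unsigned> uses;  // virtual registers read
};

struct ToyFunction {
  std::vector<ToyInstr> instrs;

  // Returns the instruction that defines `reg`.
  const ToyInstr& def_of(unsigned reg) const {
    for (const ToyInstr& mi : instrs)
      for (unsigned d : mi.defs)
        if (d == reg) return mi;
    assert(false && "register has no def");
    return instrs.front();  // not reached for valid input
  }
};

struct DefAndReg {
  const ToyInstr* def;  // first non-copy defining instruction
  unsigned reg;         // the register of that instruction we arrived at
};

// Walks backwards through copies, remembering which register we reached.
DefAndReg look_through_copies(const ToyFunction& f, unsigned reg) {
  const ToyInstr* mi = &f.def_of(reg);
  while (mi->op == Op::Copy) {
    reg = mi->uses[0];
    mi = &f.def_of(reg);
  }
  return {mi, reg};
}

// Buggy pattern: assumes the defining instruction has a single output.
unsigned store_source_buggy(const ToyFunction& f, unsigned reg) {
  return look_through_copies(f, reg).def->defs[0];
}

// Possible alternative: use the register actually reached by the walk.
unsigned store_source_fixed(const ToyFunction& f, unsigned reg) {
  return look_through_copies(f, reg).reg;
}

// The two relevant instructions from the snippet in this report.
ToyFunction report_snippet() {
  ToyFunction f;
  f.instrs.push_back({Op::Unmerge, {11, 12}, {5}});  // %11, %12 = unmerge %5
  f.instrs.push_back({Op::Copy, {22}, {12}});        // %22 = COPY %12
  return f;
}
```

In this model, store_source_buggy(report_snippet(), 22) yields 11 (the wrong
half of the vector), while store_source_fixed(report_snippet(), 22) yields 12.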
