[llvm-bugs] [Bug 34107] New: (Windows) ARM division libcall handling broken - result registers clobbered?

Mon Aug 7 12:53:15 PDT 2017

https://bugs.llvm.org/show_bug.cgi?id=34107

            Bug ID: 34107
           Summary: (Windows) ARM division libcall handling broken -
                    result registers clobbered?
           Product: libraries
           Version: 5.0
          Hardware: PC
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Backend: ARM
          Assignee: unassignedbugs at nondot.org
          Reporter: martin at martin.st
                CC: llvm-bugs at lists.llvm.org

Since SVN r305625, the results of libcalls such as __rt_udiv64 (64 bit division
on Windows on ARM), which return the quotient in r0-r1 and the remainder in
r2-r3, occasionally get clobbered before they are used.

I've bisected this regression down to the following commit:

    RegScavenging: Add scavengeRegisterBackwards()

    Re-apply r276044/r279124/r305516. Fixed a problem where we would refuse
    to place spills as the very first instruction of a basic block and thus
    artificially increase pressure (test in
    test/CodeGen/PowerPC/scavenging.mir:spill_at_begin)

    This is a variant of scavengeRegister() that works for
    enterBasicBlockEnd()/backward(). The benefit of the backward mode is
    that it is not affected by incomplete kill flags.

    This patch also changes
    PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register
    scavenger in backwards mode.

    Differential Revision: http://reviews.llvm.org/D21885

    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305625 91177308-0d34-0410-b5e6-96231b3b80d8

So far I've only been able to reproduce the issue with a fairly large source
file.

An example of the output of a code snippet which I think might be part of the
issue I'm seeing is this:

    3136:       00 f0 00 f8     bl      #0
                        00003136:  IMAGE_REL_ARM_BLX23T __rt_udiv64 <-- output in r0-r1
    313a:       0d f5 86 61     add.w   r1, sp, #1072 <-- clobbering r1 with an unrelated pointer
    313e:       4f f0 01 0e     mov.w   lr, #1
    3142:       01 f5 89 2c     add.w   r12, r1, #280576
    3146:       11 eb d0 72     adds.w  r2, r1, r0, lsr #31 <-- using r1 from the __rt_udiv64 call

The diff in the generated code for this segment from before and after this
commit is as follows:

        00 f0 00 f8     bl      #0
      IMAGE_REL_ARM_BLX23T      __rt_udiv64
      IMAGE_REL_ARM_BLX23T      __rt_udiv64
+       0d f5 86 61     add.w   r1, sp, #1072
        4f f0 01 0e     mov.w   lr, #1
-       cd f8 00 e0     str.w   lr, [sp]
-       0d f5 86 6e     add.w   lr, sp, #1072
+       01 f5 89 2c     add.w   r12, r1, #280576
        11 eb d0 72     adds.w  r2, r1, r0, lsr #31
        16 bf   itet    ne

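Thus, this clearly looks broken: after the commit, r1 is overwritten with
sp + 1072 before the adds.w reads it as the upper half of the __rt_udiv64
result.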

The output from the __rt_udiv64 call gets passed to the following function:

static av_always_inline av_const int32_t av_clipl_int32_arm(int64_t a)
{   
    int x, y;
    __asm__ ("adds   %1, %R2, %Q2, lsr #31  \n\t"
             "itet   ne                     \n\t"
             "mvnne  %1, #1<<31             \n\t"
             "moveq  %0, %Q2                \n\t"
             "eorne  %0, %1,  %R2, asr #31  \n\t"
             : "=r"(x), "=&r"(y) : "r"(a) : "cc");
    return x;
}

Is this an issue with the division libcall itself, failing to flag that all of
these registers are actually used? Or is the inline assembly somehow losing
track of the fact that both halves of the 64 bit variable are used? (The
variable is passed as input to the inline assembly snippet with an "r"
constraint, and its halves are referenced via the %R2/%Q2 names.)
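
To make it clearer why a clobbered r1 matters here, this is a rough portable sketch of what I understand the inline assembly to compute, assuming %Q2 names the register holding the low 32 bits of the input and %R2 the register holding the high 32 bits (r0 and r1 respectively in the disassembly above). The function name clipl_int32_ref is made up for illustration; this is not the FFmpeg source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference version of the asm snippet: clip a 64 bit value
 * to the int32_t range. Assumes %Q2 = low word, %R2 = high word. */
static int32_t clipl_int32_ref(int64_t a)
{
    uint32_t lo = (uint32_t)a;                    /* %Q2 (r0 above) */
    uint32_t hi = (uint32_t)((uint64_t)a >> 32);  /* %R2 (r1 above) */

    /* adds y, hi, lo, lsr #31: the sum is zero iff a fits in int32_t */
    uint32_t y = hi + (lo >> 31);
    if (y == 0)
        return (int32_t)lo;                       /* moveq x, lo */

    y = ~(UINT32_C(1) << 31);                     /* mvnne y, #1<<31 */
    /* hi, asr #31: all ones if a is negative, zero otherwise */
    uint32_t sign = (hi & UINT32_C(0x80000000)) ? UINT32_C(0xffffffff) : 0;
    return (int32_t)(y ^ sign);                   /* eorne x, y, hi, asr #31 */
}

/* Small self-check of the in-range and saturating cases. */
static void clipl_int32_ref_check(void)
{
    assert(clipl_int32_ref(5) == 5);
    assert(clipl_int32_ref(INT64_MAX) == INT32_MAX);
    assert(clipl_int32_ref(INT64_MIN) == INT32_MIN);
}
```

Both the range check and the saturation direction depend on the high word, so if r1 holds an unrelated pointer instead of the upper half of the quotient, the result is wrong for essentially any input.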
