[llvm-dev] [ARM] Thumb code-gen for 8-bit imm arguments results in extra reg copies

Prathamesh Kulkarni via llvm-dev llvm-dev at lists.llvm.org
Tue Jun 16 02:22:18 PDT 2020


Hi,
For the following test-case:

void foo(unsigned, unsigned);
void f()
{
  foo(10, 20);
  foo(10, 20);
}

clang --target=arm-linux-gnueabi -mthumb -O2 generates:

        push    {r4, r5, r7, lr}
        movs    r4, #10
        movs    r5, #20
        movs    r0, r4
        movs    r1, r5
        bl      foo
        movs    r0, r4
        movs    r1, r5
        bl      foo
        pop     {r4, r5, r7}
        pop     {r0}
        bx      r0

Is there any particular reason for loading the constants into r4 and r5 and then
copying them into r0 and r1, rather than loading them directly into r0 and r1
before each call?

I suppose that for larger immediate values, which either have to be loaded from
memory or need multiple instructions to construct, it is reasonable to
pre-compute them into registers. But for 8-bit immediate values, would it be
more beneficial to load them directly into the argument registers instead?
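
To make the 8-bit case concrete: the Thumb-1 "movs Rd, #imm" encoding carries an
unsigned 8-bit immediate, so only values 0..255 can be loaded with a single
16-bit instruction. A minimal sketch of that range check follows; the helper
name is hypothetical and is not an LLVM function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only (not an LLVM API).
   Thumb-1 "movs Rd, #imm8" encodes an unsigned 8-bit immediate,
   so only 0..255 can be materialized by a single such instruction. */
static bool fits_thumb1_movs_imm8(uint32_t value)
{
    return value <= 255u;
}
```

Both constants in the test-case (10 and 20) pass this check, which is why
re-issuing the movs before each call looks cheaper than keeping them in r4/r5.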

Looking at the ISel dump, for the above test-case:
  %0:tgpr, dead $cpsr = tMOVi8 10, 14, $noreg
  %1:tgpr, dead $cpsr = tMOVi8 20, 14, $noreg
  $r0 = COPY %0:tgpr
  $r1 = COPY %1:tgpr

IIUC, there are a couple of reasons why this happens:
(a) The tMOVi8 pattern isn't marked with isReMaterializable, isAsCheapAsAMove,
and isMoveImm.
(b) Even after annotating the pattern with the above flags,
RegisterCoalescer::reMaterializeTrivialDef still bails out because
tMOVi8 has two definitions (the result register and the dead $cpsr),
only one of which is live.

To address this issue, I have attached a hackish patch that:
(a) Marks the tMOVi8 pattern with:
let isReMaterializable = 1, isAsCheapAsAMove = 1, isMoveImm = 1
I am not sure whether this is entirely correct.

(b) Modifies RegisterCoalescer::reMaterializeTrivialDef and
TargetInstrInfo::isReallyTriviallyReMaterializableGeneric to check
for a single live def instead of a single def.
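
The difference between the two checks in (b) can be sketched with a toy model.
The types and function names below are hypothetical and are not LLVM's
MachineInstr or MachineOperand API; the sketch only illustrates why an
instruction such as tMOVi8, which defines a result register plus a dead $cpsr,
fails a "single def" test but passes a "single live def" test.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one register definition on an instruction.
   Hypothetical type; this is not LLVM's MachineOperand. */
struct toy_def {
    const char *reg;  /* e.g. "%0" or "$cpsr" */
    bool is_dead;     /* true if the defined value is never read */
};

/* Count the definitions that are not marked dead. */
static size_t count_live_defs(const struct toy_def *defs, size_t n)
{
    size_t live = 0;
    for (size_t i = 0; i < n; ++i) {
        if (!defs[i].is_dead)
            ++live;
    }
    return live;
}

/* Stricter condition: the instruction has exactly one def of any kind. */
static bool has_single_def(const struct toy_def *defs, size_t n)
{
    (void)defs;  /* only the number of defs matters here */
    return n == 1;
}

/* Relaxed condition: exactly one def is live; dead defs are ignored. */
static bool has_single_live_def(const struct toy_def *defs, size_t n)
{
    return count_live_defs(defs, n) == 1;
}
```

Modelling "%0:tgpr, dead $cpsr = tMOVi8 10" as two defs, { "%0", live } and
{ "$cpsr", dead }, the stricter condition rejects it while the relaxed one
accepts it. Whether ignoring dead defs is safe in every case is exactly the
open correctness question.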

Does the patch look like a step in the right direction?
For the above test, it generates:

        push    {r7, lr}
        movs    r0, #10
        movs    r1, #20
        bl      foo
        movs    r0, #10
        movs    r1, #20
        bl      foo
        pop     {r7}
        pop     {r0}
        bx      r0

However, I am not sure whether the patch causes correctness issues.
Testing with make check-llvm showed a relatively large fallout (36 failing
tests), which I am investigating. I just wanted to ask: is there something
obviously wrong with the patch or the idea?

Thanks,
Prathamesh
-------------- next part --------------
A non-text attachment was scrubbed...
Name: llvm-612-2.diff
Type: application/octet-stream
Size: 2132 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200616/dcda8ec1/attachment.obj>

