[llvm-commits] A bug in optimization arm_ldst_opt

Sun Jan 6 22:09:57 PST 2013

Hi,

It seems there is a bug in the arm_ldst_opt optimization, so I filed an entry at
http://llvm.org/bugs/show_bug.cgi?id=14824. Can anybody confirm this issue?

Thanks,

-Jiangning

=================================

The arm_ldst_opt optimization inserts the newly generated vldmia instruction
at an incorrect position.

For the attached small test case ldst_opt_bug.ll, using this command line,

llc -mcpu=cortex-a9 -mattr=+neon,+neonfp ldst_opt_bug.ll

we see an incorrect instruction sequence in ldst_opt_bug.s, like the one below,

        vldr    s0, [r0, #432]
        vldr    s5, [r1, #412]
        vldr    s14, [r1, #440]
        vldr    s10, [r0, #440]
        vldmia  r3, {s0, s1, s2, s3}
        add.w   r3, r2, #432
        vldr    s7, [r1, #420]

That is, the instruction "vldmia  r3, {s0, s1, s2, s3}" overwrites the s0
just loaded by "vldr    s0, [r0, #432]", which is incorrect according to the
original LLVM IR semantics in ldst_opt_bug.ll.

In arm_ldst_opt, before the vldmia instruction is generated, we have the
following IR,

(1) %S0<def> = VLDRS %R0, 102, pred:14, pred:%noreg, %Q0<imp-def>;
    mem:LD4[%arrayidx67+24](align=8)
(2) %S1<def> = VLDRS %R0, 103, pred:14, pred:%noreg, %Q0<imp-use,kill>,
    %Q0<imp-def>; mem:LD4[%arrayidx67+28]
(3) %S11<def> = VLDRS %R0, 111, pred:14, pred:%noreg, %Q2<imp-use,kill>,
    %Q2<imp-def>; mem:LD4[%arrayidx67+60]
(4) %S10<def> = VLDRS %R0, 110, pred:14, pred:%noreg, %Q2<imp-use,kill>,
    %Q2<imp-def>; mem:LD4[%arrayidx67+56](align=8)
(5) %S1<def> = VLDRS %R0, 109, pred:14, pred:%noreg, %Q0<imp-use,kill>,
    %Q0<imp-def>; mem:LD4[%arrayidx67+52]
(6) %S0<def> = VLDRS %R0, 108, pred:14, pred:%noreg, %Q0<imp-use,kill>,
    %Q0<imp-def>; mem:LD4[%arrayidx67+48](align=16)
(7) %S3<def> = VLDRS %R0, 105, pred:14, pred:%noreg, %Q0<imp-use,kill>,
    %Q0<imp-def>; mem:LD4[%arrayidx67+36]
(8) %S2<def> = VLDRS %R0, 104, pred:14, pred:%noreg, %Q0<imp-use,kill>,
    %Q0<imp-def>; mem:LD4[%arrayidx67+32](align=32)
(9) %S7<def> = VLDRS %R1, 105, pred:14, pred:%noreg, %Q1<imp-use,kill>,
    %Q1<imp-def>; mem:LD4[%arrayidx64+36]

The optimization tries to hoist instructions (7) and (8) so that they can be
merged with (1) and (2) into a vldm, because together they load sequential
memory at offsets 102*4, 103*4, 104*4, 105*4. The intention of the
optimization itself is correct.
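
To make the offset arithmetic concrete, here is a minimal sketch of the
"sequential memory" check. It is not code from the pass: the Load struct and
the helper names are made up for illustration, and it only models the two
conditions visible in the example above (adjacent word offsets and adjacent S
registers).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of one VLDRS: the destination S-register number and the
// immediate as printed in the machine IR (an offset in 4-byte words).
struct Load {
    int sreg;
    int wordOffset;
};

// The printed immediate is scaled by 4 to get the byte offset, e.g. 102 -> #408.
int byteOffset(int wordOffset) { return wordOffset * 4; }

// Returns true if, once sorted by offset, the loads cover adjacent words and
// ascending adjacent registers, so one multi-register load could replace them.
bool formsContiguousRun(std::vector<Load> loads) {
    if (loads.empty())
        return false;
    std::sort(loads.begin(), loads.end(),
              [](const Load &a, const Load &b) {
                  return a.wordOffset < b.wordOffset;
              });
    for (std::size_t i = 1; i < loads.size(); ++i) {
        if (loads[i].wordOffset != loads[i - 1].wordOffset + 1)
            return false;
        if (loads[i].sreg != loads[i - 1].sreg + 1)
            return false;
    }
    return true;
}

// Usage: instructions (1), (2), (7), (8) from the listing, in program order.
void checkReportExample() {
    assert(formsContiguousRun({{0, 102}, {1, 103}, {3, 105}, {2, 104}}));
    assert(byteOffset(102) == 408 && byteOffset(105) == 420);
}
```

In this model (1), (2), (7), (8) qualify, while a set such as (5), (6), (7)
does not, because offsets 108, 109 and 105 are not adjacent.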

After hoisting, the algorithm first generates an internal instruction
sequence,

(1)
(2)
(7)
(8)

The problem is that the newly generated vldm instruction is incorrectly
inserted after instruction (8). This clearly violates the data dependence
with instruction (6), which redefines %S0 before that point.
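
The effect of the placement can be reproduced with a toy model. The sketch
below is only an illustration under stated assumptions: the Inst struct, the
register-file map and the function names are hypothetical, memory is assumed
to hold each word's own index so that a register value shows which load wrote
it, and only the %R0-based loads (1) to (8) are modelled.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical instruction model: load `count` consecutive S registers,
// starting at `firstReg`, from consecutive words starting at `firstWord`.
// count == 1 models a vldr, count > 1 models a vldmia.
struct Inst {
    int firstReg;
    int firstWord;
    int count;
};

using RegFile = std::map<int, int>; // S-register number -> loaded value

// Executes the sequence in order; memory word N is assumed to contain N.
RegFile run(const std::vector<Inst> &seq) {
    RegFile regs;
    for (const Inst &inst : seq) {
        assert(inst.count > 0);
        for (int k = 0; k < inst.count; ++k)
            regs[inst.firstReg + k] = inst.firstWord + k;
    }
    return regs;
}

// Instructions (1)..(8) in their original order.
std::vector<Inst> originalSequence() {
    return {{0, 102, 1}, {1, 103, 1}, {11, 111, 1}, {10, 110, 1},
            {1, 109, 1}, {0, 108, 1}, {3, 105, 1},  {2, 104, 1}};
}

// (1), (2), (7), (8) replaced by one merged load placed after (8),
// which is after the redefinitions in (5) and (6).
std::vector<Inst> mergedAfterEight() {
    return {{11, 111, 1}, {10, 110, 1}, {1, 109, 1}, {0, 108, 1},
            {0, 102, 4}};
}

// The same merged load placed at the position of (1) instead.
std::vector<Inst> mergedAtOne() {
    return {{0, 102, 4}, {11, 111, 1}, {10, 110, 1}, {1, 109, 1},
            {0, 108, 1}};
}
```

In this model, run(originalSequence()) leaves 108 in S0, whereas
run(mergedAfterEight()) leaves 102 there. Placing the merged load at (1)
gives the same final registers as the original order here, but whether that
placement is legal in the real pass depends on uses and stores not shown in
the listing.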

The bug appears to come from the following code in the function
ARMLoadStoreOpt::MergeOpsUpdate,

  // Try to do the merge.
  MachineBasicBlock::iterator Loc = memOps[insertAfter].MBBI;
  ++Loc;
  if (!MergeOps(MBB, Loc, Offset, Base, BaseKill, Opcode,
                Pred, PredReg, Scratch, dl, Regs, ImpDefs))
    return;

When Loc is (8), ++Loc is (9), so the merged instruction ends up after (8).
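
The iterator arithmetic can be illustrated with a standard container. This is
a sketch only: it uses std::list of labels as a hypothetical stand-in for the
basic block, and it assumes that MergeOps inserts the new instruction before
the iterator it is given, which is how std::list::insert behaves.

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <string>

// Hypothetical stand-in for a basic block: a list of instruction labels.
using Block = std::list<std::string>;

// Mirrors the "Loc = anchor; ++Loc; insert at Loc" pattern. std::list::insert
// places the new element before the iterator, so after the increment the
// element lands immediately after `anchor`.
Block insertAfterAnchor(Block block, const std::string &anchor,
                        const std::string &merged) {
    auto loc = std::find(block.begin(), block.end(), anchor);
    if (loc == block.end())
        return block; // anchor not found: leave the block unchanged
    ++loc;
    block.insert(loc, merged);
    return block;
}

// Usage: anchoring on (8) puts the merged load between (8) and (9),
// that is, after (5) and (6) have already executed.
void checkAnchorOnEight() {
    Block before = {"(3)", "(4)", "(5)", "(6)", "(8)", "(9)"};
    Block expected = {"(3)", "(4)", "(5)", "(6)", "(8)", "vldm", "(9)"};
    assert(insertAfterAnchor(before, "(8)", "vldm") == expected);
}
```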

-------------- next part --------------
A non-text attachment was scrubbed...
Name: ldst_opt_bug.ll
Type: application/octet-stream
Size: 8437 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130107/012e5c55/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ldst_opt_bug.s
Type: application/octet-stream
Size: 2118 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130107/012e5c55/attachment-0001.obj>