[LLVMbugs] [Bug 14824] New: Optimization arm_ldst_opt inserts newly generated instruction vldmia at incorrect position

Sun Jan 6 21:58:50 PST 2013

http://llvm.org/bugs/show_bug.cgi?id=14824

             Bug #: 14824
           Summary: Optimization arm_ldst_opt inserts newly generated
                    instruction vldmia at incorrect position
           Product: libraries
           Version: trunk
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: Backend: ARM
        AssignedTo: unassignedbugs at nondot.org
        ReportedBy: liujiangning1 at gmail.com
                CC: llvmbugs at cs.uiuc.edu
    Classification: Unclassified

Created attachment 9821
  --> http://llvm.org/bugs/attachment.cgi?id=9821
The small test case verifying the bug in arm_ldst_opt

Hi,

The arm_ldst_opt optimization inserts the newly generated vldmia instruction
at an incorrect position.

For the attached small test case ldst_opt_bug.ll, using this command line,

llc -mcpu=cortex-a9 -mattr=+neon,+neonfp ldst_opt_bug.ll

we see an incorrect instruction sequence in ldst_opt_bug.s, like the one below,

        vldr    s0, [r0, #432]
        vldr    s5, [r1, #412]
        vldr    s14, [r1, #440]
        vldr    s10, [r0, #440]
        vldmia  r3, {s0, s1, s2, s3}
        add.w   r3, r2, #432
        vldr    s7, [r1, #420]

That is, the instruction "vldmia  r3, {s0, s1, s2, s3}" overwrites the value of
s0 just loaded by "vldr    s0, [r0, #432]", which is incorrect according to the
original LLVM IR semantics in ldst_opt_bug.ll.

In arm_ldst_opt, before the vldmia instruction is generated, we have the
following machine IR,

(1) %S0<def> = VLDRS %R0, 102, pred:14, pred:%noreg, %Q0<imp-def>;
mem:LD4[%arrayidx67+24](align=8)
(2) %S1<def> = VLDRS %R0, 103, pred:14, pred:%noreg, %Q0<imp-use,kill>,
%Q0<imp-def>; mem:LD4[%arrayidx67+28]
(3) %S11<def> = VLDRS %R0, 111, pred:14, pred:%noreg, %Q2<imp-use,kill>,
%Q2<imp-def>; mem:LD4[%arrayidx67+60]
(4) %S10<def> = VLDRS %R0, 110, pred:14, pred:%noreg, %Q2<imp-use,kill>,
%Q2<imp-def>; mem:LD4[%arrayidx67+56](align=8)
(5) %S1<def> = VLDRS %R0, 109, pred:14, pred:%noreg, %Q0<imp-use,kill>,
%Q0<imp-def>; mem:LD4[%arrayidx67+52]
(6) %S0<def> = VLDRS %R0, 108, pred:14, pred:%noreg, %Q0<imp-use,kill>,
%Q0<imp-def>; mem:LD4[%arrayidx67+48](align=16)
(7) %S3<def> = VLDRS %R0, 105, pred:14, pred:%noreg, %Q0<imp-use,kill>,
%Q0<imp-def>; mem:LD4[%arrayidx67+36]
(8) %S2<def> = VLDRS %R0, 104, pred:14, pred:%noreg, %Q0<imp-use,kill>,
%Q0<imp-def>; mem:LD4[%arrayidx67+32](align=32)
(9) %S7<def> = VLDRS %R1, 105, pred:14, pred:%noreg, %Q1<imp-use,kill>,
%Q1<imp-def>; mem:LD4[%arrayidx64+36]

The optimization tries to hoist instructions (7) and (8) so that they can be
merged with (1) and (2) into a single vldm, because together they load
sequential memory at offsets 102*4, 103*4, 104*4 and 105*4. The intention of
the optimization is itself correct.
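
The merge condition described above can be sketched as a standalone check.
This is a simplified model written for illustration, not the logic in the ARM
load/store optimizer itself: the struct Load and the function isMergeable are
hypothetical names, and the only inputs are the register numbers and the
word-scaled immediates visible in the listing.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical model of one VLDRS: destination S register number, base
// register number, and the word-scaled immediate (byte offset = imm * 4).
struct Load {
  int dstReg;
  int baseReg;
  int imm;
};

// Returns true when the loads share one base register and, once sorted by
// offset, load consecutive words into consecutive S registers. That is the
// shape a single vldmia can express in this simplified model.
bool isMergeable(std::vector<Load> loads) {
  if (loads.size() < 2)
    return false;
  std::sort(loads.begin(), loads.end(),
            [](const Load &a, const Load &b) { return a.imm < b.imm; });
  for (std::size_t i = 1; i < loads.size(); ++i) {
    if (loads[i].baseReg != loads[0].baseReg ||
        loads[i].imm != loads[i - 1].imm + 1 ||
        loads[i].dstReg != loads[i - 1].dstReg + 1)
      return false;
  }
  return true;
}
```

Under this model, instructions (1), (2), (7) and (8) pass the check, while
adding (3) to the set does not, because offset 111 is not adjacent to the rest.
The check says nothing about where the merged instruction may be placed.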

After hoisting, the algorithm first generates an internal instruction sequence,

(1)
(2)
(7)
(8)

The problem is that the newly generated vldm instruction is incorrectly
inserted after instruction (8). This violates the data dependence with
instruction (6), which redefines S0 (and likewise with (5), which redefines S1).

The bug appears to be related to the following code in the function
ARMLoadStoreOpt::MergeOpsUpdate,

  // Try to do the merge.
  MachineBasicBlock::iterator Loc = memOps[insertAfter].MBBI;
  ++Loc;
  if (!MergeOps(MBB, Loc, Offset, Base, BaseKill, Opcode,
                Pred, PredReg, Scratch, dl, Regs, ImpDefs))
    return;

When Loc is (8), ++Loc is (9), so the merged instruction is inserted just
before (9).
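
The effect of the insertion point can be reproduced with a toy model. This is
only a sketch under stated assumptions, not LLVM code: the names Op, run,
original and merged are hypothetical, the memory word at index w is assumed to
hold the value w, and instruction (9) is left out because it uses a different
base register.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Toy register file of 16 S registers, all starting at zero.
using RegFile = std::array<int, 16>;

// One load: `count` consecutive words starting at word index `imm` go into
// `count` consecutive S registers starting at `firstReg`.
struct Op {
  int firstReg;
  int imm;
  int count; // 1 models vldr, N models a vldmia of N registers
};

// Executes the loads in order; memory word w is assumed to hold the value w.
RegFile run(const std::vector<Op> &ops) {
  RegFile regs{};
  for (const Op &op : ops)
    for (int i = 0; i < op.count; ++i)
      regs[static_cast<std::size_t>(op.firstReg + i)] = op.imm + i;
  return regs;
}

// Loads (1)-(8) from the report, in original program order.
std::vector<Op> original() {
  return {{0, 102, 1},  {1, 103, 1}, {11, 111, 1}, {10, 110, 1},
          {1, 109, 1},  {0, 108, 1}, {3, 105, 1},  {2, 104, 1}};
}

// Drops (1), (2), (7), (8) and inserts one merged 4-register load so that
// `before` of the remaining loads (3)-(6) execute ahead of it.
std::vector<Op> merged(std::size_t before) {
  std::vector<Op> rest = {
      {11, 111, 1}, {10, 110, 1}, {1, 109, 1}, {0, 108, 1}};
  rest.insert(rest.begin() + static_cast<std::ptrdiff_t>(before),
              Op{0, 102, 4});
  return rest;
}
```

In this model, merged(4) corresponds to inserting the vldm after (8) and leaves
S0 and S1 holding the words at 102 and 103 instead of 108 and 109. merged(0)
corresponds to placing it where (1) and (2) were and gives the same final
registers as original(). Whether that placement is the right fix in
MergeOpsUpdate is not something this sketch can establish.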
