[llvm-commits] A bug in optimization arm_ldst_opt
Jiangning Liu
jiangning.liu at arm.com
Sun Jan 6 22:09:57 PST 2013
Hi,
It seems there is a bug in optimization arm_ldst_opt, so I filed an entry at
http://llvm.org/bugs/show_bug.cgi?id=14824. Can anybody confirm this issue?
Thanks,
-Jiangning
=================================
The arm_ldst_opt optimization inserts the newly generated vldmia instruction
at an incorrect position.
For the attached small test case ldst_opt_bug.ll, using this command line,
llc -mcpu=cortex-a9 -mattr=+neon,+neonfp ldst_opt_bug.ll
we see an incorrect instruction sequence in ldst_opt_bug.s, as below,
vldr s0, [r0, #432]
vldr s5, [r1, #412]
vldr s14, [r1, #440]
vldr s10, [r0, #440]
vldmia r3, {s0, s1, s2, s3}
add.w r3, r2, #432
vldr s7, [r1, #420]
That is, instruction "vldmia r3, {s0, s1, s2, s3}" overwrites the s0 just
loaded by "vldr s0, [r0, #432]", which is incorrect according to the original
LLVM IR semantics in ldst_opt_bug.ll.
In arm_ldst_opt, before the vldmia instruction is generated, we have the
following IR,
(1) %S0<def> = VLDRS %R0, 102, pred:14, pred:%noreg, %Q0<imp-def>;
mem:LD4[%arrayidx67+24](align=8)
(2) %S1<def> = VLDRS %R0, 103, pred:14, pred:%noreg, %Q0<imp-use,kill>,
%Q0<imp-def>; mem:LD4[%arrayidx67+28]
(3) %S11<def> = VLDRS %R0, 111, pred:14, pred:%noreg, %Q2<imp-use,kill>,
%Q2<imp-def>; mem:LD4[%arrayidx67+60]
(4) %S10<def> = VLDRS %R0, 110, pred:14, pred:%noreg, %Q2<imp-use,kill>,
%Q2<imp-def>; mem:LD4[%arrayidx67+56](align=8)
(5) %S1<def> = VLDRS %R0, 109, pred:14, pred:%noreg, %Q0<imp-use,kill>,
%Q0<imp-def>; mem:LD4[%arrayidx67+52]
(6) %S0<def> = VLDRS %R0, 108, pred:14, pred:%noreg, %Q0<imp-use,kill>,
%Q0<imp-def>; mem:LD4[%arrayidx67+48](align=16)
(7) %S3<def> = VLDRS %R0, 105, pred:14, pred:%noreg, %Q0<imp-use,kill>,
%Q0<imp-def>; mem:LD4[%arrayidx67+36]
(8) %S2<def> = VLDRS %R0, 104, pred:14, pred:%noreg, %Q0<imp-use,kill>,
%Q0<imp-def>; mem:LD4[%arrayidx67+32](align=32)
(9) %S7<def> = VLDRS %R1, 105, pred:14, pred:%noreg, %Q1<imp-use,kill>,
%Q1<imp-def>; mem:LD4[%arrayidx64+36]
The optimization tries to hoist instructions (7) and (8) so that they can be
merged with (1) and (2) into a vldm, because together they load sequential
memory at offsets 102*4, 103*4, 104*4, 105*4. The intention of the
optimization is itself correct.
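
As a rough illustration of why (1), (2), (8), (7) are merge candidates, the
sketch below sorts the R0 loads from the listing by word offset and looks for
the longest run in which both the offset and the S-register number increase by
one. The Load struct and the function names are hypothetical and only model
the listing; this is not the code of the LLVM pass.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical simplified model of one VLDRS from the listing.
struct Load {
  int id;       // label in the listing, e.g. 1 for (1)
  int sreg;     // n for destination register Sn
  int wordOff;  // immediate operand; the byte offset is wordOff * 4
};

// Returns the labels of the longest run of loads whose word offsets and
// register numbers both increase by one, after sorting by offset.
std::vector<int> longestConsecutiveRun(std::vector<Load> loads) {
  std::sort(loads.begin(), loads.end(),
            [](const Load &a, const Load &b) { return a.wordOff < b.wordOff; });
  std::vector<int> best, cur;
  for (std::size_t i = 0; i < loads.size(); ++i) {
    // This model assumes each word is loaded at most once.
    assert(i == 0 || loads[i].wordOff != loads[i - 1].wordOff);
    bool extends = i > 0 &&
                   loads[i].wordOff == loads[i - 1].wordOff + 1 &&
                   loads[i].sreg == loads[i - 1].sreg + 1;
    if (!extends)
      cur.clear();
    cur.push_back(loads[i].id);
    if (cur.size() > best.size())
      best = cur;
  }
  return best;
}

// The eight loads from base R0, (1) through (8), as they appear in the listing.
std::vector<Load> r0Loads() {
  return {{1, 0, 102}, {2, 1, 103}, {3, 11, 111}, {4, 10, 110},
          {5, 1, 109}, {6, 0, 108}, {7, 3, 105},  {8, 2, 104}};
}
```

Calling longestConsecutiveRun(r0Loads()) yields the labels 1, 2, 8, 7, that is,
S0 to S3 loaded from word offsets 102 to 105.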
After hoisting, the algorithm first generates an internal instruction
sequence,
(1)
(2)
(7)
(8)
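The problem is that the newly generated vldm instruction is incorrectly
inserted after instruction (8). Obviously the data dependence with
instruction (6) is violated here: (6) redefines S0 after (1), so a vldm placed
after (8) overwrites the value that (6) loaded.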
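
The effect of the placement can be checked with a toy register-file
simulation. This is only a sketch under stated assumptions: memory word k is
assumed to hold the value k, the Op struct and function names are
hypothetical, and mergedBefore5 is just one placement that preserves the
result in this model, not necessarily what a fix in the pass would choose.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Hypothetical model: 32 single-precision registers. Memory word k is assumed
// to hold k, so each register ends up holding the word offset it was last
// loaded from.
using RegFile = std::array<int, 32>;

struct Op {
  int firstReg;  // first destination register (n for Sn)
  int wordOff;   // word offset of the first loaded element
  int count;     // 1 for a VLDRS, 4 for the merged VLDM of S0..S3
};

RegFile run(const std::vector<Op> &ops) {
  RegFile regs;
  regs.fill(-1);  // -1 marks "never written"
  for (const Op &op : ops) {
    assert(op.firstReg >= 0 && op.firstReg + op.count <= 32);
    for (int i = 0; i < op.count; ++i)
      regs[op.firstReg + i] = op.wordOff + i;
  }
  return regs;
}

// Original order of the R0 loads (1)..(8).
std::vector<Op> originalOrder() {
  return {{0, 102, 1}, {1, 103, 1}, {11, 111, 1}, {10, 110, 1},
          {1, 109, 1}, {0, 108, 1}, {3, 105, 1},  {2, 104, 1}};
}

// (1), (2), (7), (8) replaced by one merged load placed where (8) was,
// that is, after (5) and (6).
std::vector<Op> mergedAfter8() {
  return {{11, 111, 1}, {10, 110, 1}, {1, 109, 1}, {0, 108, 1}, {0, 102, 4}};
}

// The same merged load placed ahead of (5) and (6) instead (hypothetical).
std::vector<Op> mergedBefore5() {
  return {{0, 102, 4}, {11, 111, 1}, {10, 110, 1}, {1, 109, 1}, {0, 108, 1}};
}
```

In this model run(originalOrder()) leaves S0 loaded from word 108, while
run(mergedAfter8()) leaves S0 loaded from word 102. run(mergedBefore5())
produces the same register file as the original order.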
The source code introducing the bug has something to do with the function
ARMLoadStoreOpt::MergeOpsUpdate,
// Try to do the merge.
MachineBasicBlock::iterator Loc = memOps[insertAfter].MBBI;
++Loc;
if (!MergeOps(MBB, Loc, Offset, Base, BaseKill, Opcode,
              Pred, PredReg, Scratch, dl, Regs, ImpDefs))
  return;
When Loc points at (8), ++Loc advances it to (9), so the merged instruction
is inserted between (8) and (9).
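
The iterator arithmetic can be mimicked with a std::list, assuming the merged
instruction is inserted immediately before Loc, which is how std::list::insert
behaves. The function names and the label 0 for the new instruction are
hypothetical, and removal of the merged-away loads is not modelled.

```cpp
#include <algorithm>
#include <cassert>
#include <list>

// Hypothetical stand-in for the basic block: instruction labels (1)..(9).
std::list<int> blockOrder() { return {1, 2, 3, 4, 5, 6, 7, 8, 9}; }

// Finds `label`, advances the iterator once (the "++Loc" step) and inserts
// the new instruction, labelled 0, before the advanced position.
std::list<int> insertBeforeNext(std::list<int> block, int label) {
  auto Loc = std::find(block.begin(), block.end(), label);
  assert(Loc != block.end());
  ++Loc;
  block.insert(Loc, 0);  // std::list::insert places the element before Loc
  return block;
}
```

With insertBeforeNext(blockOrder(), 8) the order becomes
1 2 3 4 5 6 7 8 0 9, so the new instruction sits after both (5) and (6).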
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ldst_opt_bug.ll
Type: application/octet-stream
Size: 8437 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130107/012e5c55/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ldst_opt_bug.s
Type: application/octet-stream
Size: 2118 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130107/012e5c55/attachment-0001.obj>