[PATCH] D9107: Fix CombineToPostIndexedLoadStore in DAGCombiner.cpp
Quentin Colombet
qcolombet at apple.com
Tue Jul 7 13:50:45 PDT 2015
Hi François,
A couple more comments.
It seems your patch exposed a few short coming in the load store optimizer for ARM. I am not sure we can proceed with this commit without fixing the load store optimizer first, otherwise we may regress a bunch of thing… Which is unfortunate since your patch seems a good general improvement to me.
Did you happen to run benchmark on ARM with/without this change?
That would help making our mind.
Anyway, let me look at the XFAILed test cases.
Cheers,
-Quentin
================
Comment at: test/CodeGen/ARM/automod-test.ll:27
@@ +26,3 @@
+; bne .LBB0_1
+; ======================================================
+
----------------
Looks like we could improve the load/store optimizer to catch those cases as well.
================
Comment at: test/CodeGen/ARM/avoid-cpsr-rmw.ll:31
@@ -30,3 +30,2 @@
-; CHECK-NOT: muls
%ptr1.addr.09 = phi i32* [ %add.ptr, %while.body ], [ %ptr1, %entry ]
%ptr2.addr.08 = phi i32* [ %incdec.ptr, %while.body ], [ %ptr2, %entry ]
----------------
qcolombet wrote:
> Could you comment why this check is failing now?
> I am not saying this is wrong, I’d like to check we are not missing something.
Never mind me, I found the answer in an exchange we had offline.
For the record, the assembly code generated without and with your patch are the following :
Without the patch
-----------------
ldr.w r9, [r0]
ldr r3, [r0, #4]
ldr r2, [r0, #8]
ldr.w r12, [r0, #12]
adds r0, #16 <— after this point, having muls would be wrong.
mul r3, r3, r9
mul r2, r3, r2
mul r2, r2, r12
With the patch
--------------
ldr.w r12, [r0, #4]
ldr r3, [r0, #8]
ldr.w r9, [r0, #12]
ldr r2, [r0], #16
mul r2, r12, r2
muls r2, r3, r2 <— Now the muls is necessary, since we got rid of the adds.
mul r2, r2, r9
http://reviews.llvm.org/D9107
More information about the llvm-commits
mailing list