[LLVMdev] Possible missed optimization?

Jakob Stoklund Olesen stoklund at 2pi.dk
Sun Sep 5 11:04:00 PDT 2010

On Sep 4, 2010, at 5:40 PM, Eli Friedman wrote:

> If you want to take a look at this yourself, the issue is easy to
> reproduce with Thumb1:

Thanks, Eli. Nice catch!

This IR:

target triple = "thumbv5-u-u"

define arm_aapcscc i64 @foo(i64 %a, i64 %b) nounwind readnone {
  %xor = xor i64 %a, 18                           ; <i64> [#uses=1]
  %xor2 = xor i64 %xor, %b                        ; <i64> [#uses=1]
  ret i64 %xor2
}

produces these instructions before coalescing:

4L      %reg16387<def> = COPY %R3<kill>
12L     %reg16386<def> = COPY %R2<kill>
20L     %reg16385<def> = COPY %R1<kill>
28L     %reg16384<def> = COPY %R0<kill>
36L     %reg16388<def> = COPY %reg16385<kill>
44L     %reg16388<def>, %CPSR<def,dead> = tEOR %reg16388, %reg16387<kill>, pred:14, pred:%reg0
56L     %reg16389<def> = COPY %reg16384<kill>
64L     %reg16389<def>, %CPSR<def,dead> = tEOR %reg16389, %reg16386<kill>, pred:14, pred:%reg0
76L     %reg16390<def>, %CPSR<def,dead> = tMOVi8 18, pred:14, pred:%reg0
88L     %reg16391<def> = COPY %reg16390<kill>
96L     %reg16391<def>, %CPSR<def,dead> = tEOR %reg16391, %reg16389<kill>, pred:14, pred:%reg0
108L    %R0<def> = COPY %reg16391<kill>
116L    %R1<def> = COPY %reg16388<kill>
128L    tBX_RET %R0<imp-use,kill>, %R1<imp-use,kill>

and after:

44L     %R1<def>, %CPSR<def,dead> = tEOR %R1, %R3<kill>, pred:14, pred:%reg0
56L     %reg16389<def> = COPY %R0<kill>
64L     %reg16389<def>, %CPSR<def,dead> = tEOR %reg16389, %R2<kill>, pred:14, pred:%reg0
76L     %R0<def>, %CPSR<def,dead> = tMOVi8 18, pred:14, pred:%reg0
96L     %R0<def>, %CPSR<def,dead> = tEOR %R0, %reg16389<kill>, pred:14, pred:%reg0
128L    tBX_RET %R0<imp-use,kill>, %R1<imp-use,kill>

We see, as Borja pointed out, that %R0 from the 108L COPY has been joined with %reg16391 and %reg16390, so it is too late to commute the xor. %R0 is redefined at 76L while %reg16389 is still live, so the 56L copy has to stay.

Passing -disable-physical-join to prevent the %R0 sabotage, we get:

4L      %reg16387<def> = COPY %R3<kill>; tGPR:%reg16387
12L     %reg16386<def> = COPY %R2<kill>; tGPR:%reg16386
20L     %reg16388<def> = COPY %R1<kill>; tGPR:%reg16388
28L     %reg16389<def> = COPY %R0<kill>; tGPR:%reg16389
44L     %reg16388<def>, %CPSR<def,dead> = tEOR %reg16388, %reg16387<kill>, pred:14, pred:%reg0; tGPR:%reg16388,16387
64L     %reg16389<def>, %CPSR<def,dead> = tEOR %reg16389, %reg16386<kill>, pred:14, pred:%reg0; tGPR:%reg16389,16386
76L     %reg16391<def>, %CPSR<def,dead> = tMOVi8 18, pred:14, pred:%reg0; tGPR:%reg16391
96L     %reg16391<def>, %CPSR<def,dead> = tEOR %reg16391, %reg16389<kill>, pred:14, pred:%reg0; tGPR:%reg16391,16389
108L    %R0<def> = COPY %reg16391<kill>; tGPR:%reg16391
116L    %R1<def> = COPY %reg16388<kill>; tGPR:%reg16388
128L    tBX_RET %R0<imp-use,kill>, %R1<imp-use,kill>

It is not easy to see here that the 96L tEOR should be commuted. You would have to notice that the hints for %reg16389 and %reg16391 clash: both want %R0.
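
To make the clash concrete, here is a rough Python sketch (nothing like this exists in LLVM as far as this post is concerned) that derives hints from the physical-register COPYs in the dump above and flags any tEOR whose two operands are hinted to the same register. The tuple encoding and the names collect_hints and clashing_teors are invented for illustration.

```python
# Toy hint-clash check over the -disable-physical-join dump.
# The data encoding and all helper names are hypothetical.

# COPY instructions: (slot index, destination, source)
COPIES = [
    (4, "%reg16387", "%R3"),
    (12, "%reg16386", "%R2"),
    (20, "%reg16388", "%R1"),
    (28, "%reg16389", "%R0"),
    (108, "%R0", "%reg16391"),
    (116, "%R1", "%reg16388"),
]

# Two-address tEORs: (slot index, tied def/use operand, other operand)
TEORS = [
    (44, "%reg16388", "%reg16387"),
    (64, "%reg16389", "%reg16386"),
    (96, "%reg16391", "%reg16389"),
]


def is_virtual(reg):
    """Virtual registers are printed as %regNNNNN in these dumps."""
    return reg.startswith("%reg")


def collect_hints(copies):
    """Map each virtual register to the physical registers it is copied to or from."""
    hints = {}
    for _, dst, src in copies:
        if is_virtual(dst) and not is_virtual(src):
            hints.setdefault(dst, set()).add(src)
        elif is_virtual(src) and not is_virtual(dst):
            hints.setdefault(src, set()).add(dst)
    return hints


def clashing_teors(teors, hints):
    """Return the slot indexes of tEORs whose two operands share a hint."""
    return [
        slot
        for slot, tied, other in teors
        if hints.get(tied, set()) & hints.get(other, set())
    ]


HINTS = collect_hints(COPIES)
CLASHES = clashing_teors(TEORS, HINTS)
print(CLASHES)
```

Only the 96L tEOR is flagged: %reg16389 comes from %R0 at 28L and %reg16391 goes to %R0 at 108L. A shared hint is only a symptom, not a proof that commuting helps.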

After register allocation with hinting it becomes:

        %R1<def>, %CPSR<def,dead> = tEOR %R1, %R3<kill>, pred:14, pred:%reg0
        %R0<def>, %CPSR<def,dead> = tEOR %R0, %R2<kill>, pred:14, pred:%reg0
        %R2<def>, %CPSR<def,dead> = tMOVi8 18, pred:14, pred:%reg0
        %R2<def>, %CPSR<def,dead> = tEOR %R2, %R0<kill>, pred:14, pred:%reg0
        %R0<def> = COPY %R2<kill>
        tBX_RET %R0<imp-use,kill>, %R1<imp-use,kill>
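
For comparison, here is a small Python sketch that executes the sequence above next to a hypothetical version with the last tEOR commuted. The commuted sequence is a reconstruction of what we would like to get, not compiler output, and the mini-interpreter with its opcode names is invented. It assumes the little-endian AAPCS layout, with %a arriving in R0:R1 and %b in R2:R3, low word first.

```python
# Toy interpreter for the Thumb1 sequences discussed here.
# Opcode names and the tuple encoding are hypothetical.

MASK32 = 0xFFFFFFFF


def run(program, a, b):
    """Execute a program on i64 arguments a and b, returning the i64 in R0:R1."""
    regs = {
        "R0": a & MASK32,          # low word of a
        "R1": (a >> 32) & MASK32,  # high word of a
        "R2": b & MASK32,          # low word of b
        "R3": (b >> 32) & MASK32,  # high word of b
    }
    for op, dst, src in program:
        if op == "eor":      # two-address: dst = dst ^ src
            regs[dst] ^= regs[src]
        elif op == "movi":   # 8-bit immediate move
            regs[dst] = src & 0xFF
        elif op == "copy":
            regs[dst] = regs[src]
        else:
            raise ValueError("unknown op: " + op)
    return regs["R0"] | (regs["R1"] << 32)


# The sequence after register allocation with hinting.
AS_EMITTED = [
    ("eor", "R1", "R3"),
    ("eor", "R0", "R2"),
    ("movi", "R2", 18),
    ("eor", "R2", "R0"),
    ("copy", "R0", "R2"),
]

# Hypothetical result if the last tEOR had been commuted:
# R0 stays the tied operand and the trailing COPY disappears.
COMMUTED = [
    ("eor", "R1", "R3"),
    ("eor", "R0", "R2"),
    ("movi", "R2", 18),
    ("eor", "R0", "R2"),
]

a, b = 0x0123456789ABCDEF, 0xFEDCBA9876543210
print(run(AS_EMITTED, a, b) == run(COMMUTED, a, b) == (a ^ 18) ^ b)
print(len(AS_EMITTED) - len(COMMUTED))
```

Both sequences compute (a ^ 18) ^ b. The commuted one is one instruction shorter.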

There are two fundamental deficiencies here:

1. The coalescer is not very good at handling conflicting joins. The examples show that different orders of joining can give different results. The coalescer uses heuristics to pick an order; it doesn't try to find an optimal one.

2. Commuting two-address instructions is not really integrated into the coalescer algorithm. It is more of an afterthought, calling RemoveCopyByCommutingDef when a copy could otherwise not be removed.
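
The order dependence in point 1 can be reproduced with a toy model. The sketch below is not the LLVM coalescer: it joins copies greedily in the order given and refuses a join when any members of the two classes interfere. The registers a to d, the interference set, and the function name coalesce are all made up.

```python
# Toy greedy coalescer showing that join order changes the result.
# This is an illustrative model only, with hypothetical names throughout.


def coalesce(copies, interference):
    """Join copies in the given order; return the copies that could not be removed."""
    rep = {}      # register -> register it was merged into
    members = {}  # class representative -> set of registers in that class

    def find(reg):
        # Follow merge links until we reach a class representative.
        while rep.get(reg, reg) != reg:
            reg = rep[reg]
        return reg

    remaining = []
    for dst, src in copies:
        a, b = find(dst), find(src)
        if a == b:
            continue  # already joined, the copy is trivially dead
        class_a = members.setdefault(a, {a})
        class_b = members.setdefault(b, {b})
        if any(frozenset((x, y)) in interference for x in class_a for y in class_b):
            remaining.append((dst, src))  # conflicting join, the copy stays
        else:
            rep[b] = a
            class_a |= class_b
    return remaining


# a interferes with c and d; b, c and d are mutually compatible.
INTERFERENCE = {frozenset(("a", "c")), frozenset(("a", "d"))}

# Same three copies, two different join orders.
ORDER_1 = [("b", "a"), ("c", "b"), ("d", "b")]
ORDER_2 = [("c", "b"), ("d", "b"), ("b", "a")]

print(len(coalesce(ORDER_1, INTERFERENCE)))
print(len(coalesce(ORDER_2, INTERFERENCE)))
```

Joining b with a first drags a's interferences into the class and blocks both later joins. Joining b with c and d first leaves only the copy from a.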

