[llvm-commits] [PATCH] Fix nondeterministic codegen in stack-coloring pass

Thu Nov 15 11:25:18 PST 2012

Hello,

In my attempts to complete a "bootstrap" of clang on PowerPC, I'm still
running into spurious differences between the stage-2 and stage-3 compiler
binaries.  One large set of such differences was due to different
assignments of variables to stack slots.  It turns out that this happens in
the stack-coloring pass (CodeGen/StackColoring.cpp).  In one example, that
pass was merging stack slots like this (in the stage-2 binary):
   frame index 0 --> merged into frame index 30
   frame index 1 --> merged into frame index 25
In the stage-3 binary we have instead:
   frame index 0 --> merged into frame index 25
   frame index 1 --> merged into frame index 30

Comparing the machine instruction dumps after the stack-coloring pass
shows:
     Predecessors according to CFG: BB#156
        %vreg573<def> = LD 10, %vreg192; mem:LD8[<unknown>] G8RC:%vreg573,%vreg192
        STD %vreg573<kill>, 0, <fi#26>; mem:ST8[%ReadersWriters.i.i](tbaa=!"any pointer") G8RC:%vreg573
-       STD %vreg126, 0, <fi#30>; mem:ST8[%Readers.i](tbaa=!"any pointer") G8RC:%vreg126
-       STD %vreg125, 0, <fi#25>; mem:ST8[%AllocRelatedValues.i.i](tbaa=!"any pointer") G8RC:%vreg125
-       %vreg574<def> = LD 0, <fi#30>; mem:LD8[<unknown>] G8RC:%vreg574
+       STD %vreg126, 0, <fi#25>; mem:ST8[%AllocRelatedValues.i.i](tbaa=!"any pointer") G8RC:%vreg126
+       STD %vreg125, 0, <fi#30>; mem:ST8[%Readers.i](tbaa=!"any pointer") G8RC:%vreg125
+       %vreg574<def> = LD 0, <fi#25>; mem:LD8[<unknown>] G8RC:%vreg574
        STD %vreg574<kill>, 14, %X1; mem:ST8[<unknown>] G8RC:%vreg574
-       %vreg575<def> = LD 0, <fi#25>; mem:LD8[<unknown>] G8RC:%vreg575
+       %vreg575<def> = LD 0, <fi#30>; mem:LD8[<unknown>] G8RC:%vreg575
        STD %vreg575<kill>, 16, %X1; mem:ST8[<unknown>] G8RC:%vreg575
        ADJCALLSTACKDOWN 112, %R1<imp-def,dead>, %R1<imp-use>
        %vreg576<def> = LBZ8 0, <fi#23>; mem:LD1[<unknown>](align=8) G8RC:%vreg576
-       %vreg577<def> = LD 0, <fi#25>; mem:LD8[<unknown>] G8RC:%vreg577
-       %vreg578<def> = LD 0, <fi#30>; mem:LD8[<unknown>] G8RC:%vreg578
+       %vreg577<def> = LD 0, <fi#30>; mem:LD8[<unknown>] G8RC:%vreg577
+       %vreg578<def> = LD 0, <fi#25>; mem:LD8[<unknown>] G8RC:%vreg578
        %vreg579<def> = ADDI8 <fi#28>, 0; G8RC:%vreg579
        %vreg580<def> = ADDI8 <fi#26>, 0; G8RC:%vreg580
        %X3<def> = COPY %vreg579; G8RC:%vreg579

The reason for this difference is this piece of code:

  // This is a simple greedy algorithm for merging allocas. First, sort the
  // slots, placing the largest slots first. Next, perform an n^2 scan and look
  // for disjoint slots. When you find disjoint slots, merge the smaller one
  // into the bigger one and update the live interval. Remove the small alloca
  // and continue.

  // Sort the slots according to their size. Place unused slots at the end.
  std::sort(SortedSlots.begin(), SortedSlots.end(), SlotSizeSorter(MFI));

In general, there may be many stack slots of the same size, so the sort
order imposed by SlotSizeSorter is not total: equal-sized slots compare as
equivalent.  std::sort makes no guarantee about the relative order of
equivalent elements, so the sorted array can differ between the stage-2 and
stage-3 binaries.  These differences in the sorted array then lead to
corresponding differences in the choice of stack slots to merge.
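
To illustrate how the order of equal-sized slots changes the merge result,
here is a minimal toy model of the greedy scan described in the comment
above.  It is a sketch only, not the actual StackColoring code: the Range
struct, the greedyMerge function and the live ranges used with it are all
hypothetical, and slot sizes are left out because every slot in the example
is assumed to have the same size.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical toy model: a half-open live range [begin, end).
struct Range {
  int begin;
  int end;
};

static bool overlaps(const Range &a, const Range &b) {
  return a.begin < b.end && b.begin < a.end;
}

// Greedy merge over slots visited in the given (already sorted) order.
// `live` maps a slot index to the live ranges it currently covers; it is
// taken by value because merging extends the surviving slot's ranges.
// Returns a map from each merged-away slot to the slot it was merged into.
std::map<int, int> greedyMerge(const std::vector<int> &order,
                               std::map<int, std::vector<Range>> live) {
  std::map<int, int> mergedInto;
  for (std::size_t i = 0; i < order.size(); ++i) {
    int a = order[i];
    if (mergedInto.count(a))
      continue; // already merged into an earlier slot
    for (std::size_t j = i + 1; j < order.size(); ++j) {
      int b = order[j];
      if (mergedInto.count(b))
        continue;
      bool disjoint = true;
      for (const Range &ra : live[a])
        for (const Range &rb : live[b])
          if (overlaps(ra, rb))
            disjoint = false;
      if (!disjoint)
        continue;
      // Merge b into a: a now covers both sets of live ranges.
      live[a].insert(live[a].end(), live[b].begin(), live[b].end());
      mergedInto[b] = a;
    }
  }
  return mergedInto;
}
```

With invented live ranges 25:[0,10), 30:[5,15), 0:[20,30) and 1:[25,35),
visiting the slots in the order 30, 25, 0, 1 merges slot 0 into 30 and slot
1 into 25, while the order 25, 30, 0, 1 merges slot 0 into 25 and slot 1
into 30.  Swapping two equal-sized entries in the sorted array is enough to
swap the assignments, which is the pattern seen in the dumps above.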

The attached patch fixes this by using std::stable_sort instead.
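
As a small standalone illustration of the guarantee this relies on (again a
sketch, not the patch itself): std::stable_sort keeps equivalent elements in
their input order, so ties between equal-sized slots are resolved the same
way on every run.  The BySizeDescending comparator and the sortSlotsStable
helper below are hypothetical stand-ins for SlotSizeSorter and the call in
the pass; the plain vector of sizes stands in for the frame info.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for SlotSizeSorter: orders slot indices by size,
// largest first.  Slots of equal size compare as equivalent.
struct BySizeDescending {
  const std::vector<int> &sizes; // sizes[i] is the size of slot i
  bool operator()(int lhs, int rhs) const { return sizes[lhs] > sizes[rhs]; }
};

// Sorts slot indices by descending size.  Because std::stable_sort is used,
// slots of equal size stay in the order they had in `slots`.
std::vector<int> sortSlotsStable(std::vector<int> slots,
                                 const std::vector<int> &sizes) {
  std::stable_sort(slots.begin(), slots.end(), BySizeDescending{sizes});
  return slots;
}
```

For sizes {8, 8, 16, 8, 4} and input slots 0 through 4, the result is
2, 0, 1, 3, 4: the 16-byte slot moves to the front and the three 8-byte
slots keep their original relative order.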

Tested on powerpc64-linux with no regressions in test/ or
projects/test-suite/.   Fixes the vast majority of code differences between
stage-2 and stage-3 in a bootstrap on PowerPC (there are still two
unrelated differences left).

OK to commit?

Bye,
Ulrich

(See attached file: diff-llvm-stackslot-stablesort)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: diff-llvm-stackslot-stablesort
Type: application/octet-stream
Size: 614 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20121115/1f7469e0/attachment.obj>