<html>
<head>
<base href="http://llvm.org/bugs/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW --- - -no-phi-elim-live-out-early-exit (r231064) regresses generated code"
href="http://llvm.org/bugs/show_bug.cgi?id=22768">22768</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>-no-phi-elim-live-out-early-exit (r231064) regresses generated code
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>All
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Common Code Generator Code
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>djasper@google.com
</td>
</tr>
<tr>
<th>CC</th>
<td>llvmbugs@cs.uiuc.edu
</td>
</tr>
<tr>
<th>Classification</th>
<td>Unclassified
</td>
</tr></table>
<p>
<div>
<pre>Specifically in
CodeGen/X86/coalescer-commute4.ll
CodeGen/X86/phys_subreg_coalesce-2.ll
CodeGen/X86/zlib-longest-match.ll
Copies are not coalesced properly. This seems to be an issue of the visitation
order in RegisterCoalescer. Looking at CodeGen/X86/coalescer-commute4.ll:
Without -no-phi-elim-live-out-early-exit:
# *** IR Dump Before Simple Register Coalescing ***:
# Machine code for function foo: Post SSA
Frame Objects:
fi#-3: size=4, align=8, fixed, at location [SP+12]
fi#-2: size=4, align=4, fixed, at location [SP+8]
fi#-1: size=4, align=16, fixed, at location [SP+4]
fi#0: size=4, align=4, at location [SP+4]
0B BB#0: derived from LLVM BB %entry
16B %vreg7<def> = MOV32rm <fi#-3>, 1, %noreg, 0, %noreg;
mem:LD4[FixedStack-3](align=8) GR32:%vreg7
32B %vreg8<def> = FsFLD0SS; FR32:%vreg8
48B TEST32rr %vreg7, %vreg7, %EFLAGS<imp-def>; GR32:%vreg7
64B JNE_1 <BB#1>, %EFLAGS<imp-use,kill>
Successors according to CFG: BB#4(12) BB#1(20)
80B BB#4:
Predecessors according to CFG: BB#0
96B %vreg16<def> = COPY %vreg8; FR32:%vreg16,%vreg8
112B JMP_1 <BB#3>
Successors according to CFG: BB#3
128B BB#1:
Predecessors according to CFG: BB#0
144B %vreg6<def> = MOV32rm <fi#-2>, 1, %noreg, 0, %noreg;
mem:LD4[FixedStack-2] GR32:%vreg6
160B %vreg5<def> = MOV32rm <fi#-1>, 1, %noreg, 0, %noreg;
mem:LD4[FixedStack-1](align=16) GR32:%vreg5
176B %vreg9<def> = MOV32r0 %EFLAGS<imp-def,dead>; GR32:%vreg9
192B %vreg14<def> = COPY %vreg9; GR32_NOSP:%vreg14 GR32:%vreg9
208B %vreg15<def> = COPY %vreg8; FR32:%vreg15,%vreg8
Successors according to CFG: BB#2
224B BB#2: derived from LLVM BB %bb
Predecessors according to CFG: BB#2 BB#1
240B %vreg1<def> = COPY %vreg15; FR32:%vreg1,%vreg15
256B %vreg0<def> = COPY %vreg14; GR32_NOSP:%vreg0,%vreg14
272B %vreg10<def> = CVTSI2SSrm %vreg5, 4, %vreg0, 0, %noreg;
mem:LD4[%scevgep1] FR32:%vreg10 GR32:%vreg5 GR32_NOSP:%vreg0
288B %vreg11<def> = COPY %vreg10; FR32:%vreg11,%vreg10
304B %vreg11<def,tied1> = MULSSrm %vreg11<tied0>, %vreg6, 4, %vreg0,
0, %noreg; mem:LD4[%scevgep] FR32:%vreg11 GR32:%vreg6 GR32_NOSP:%vreg0
320B %vreg2<def> = COPY %vreg11; FR32:%vreg2,%vreg11
336B %vreg2<def,tied1> = ADDSSrr %vreg2<tied0>, %vreg1;
FR32:%vreg2,%vreg1
352B %vreg3<def> = COPY %vreg0; GR32_NOSP:%vreg3,%vreg0
368B %vreg3<def,tied1> = INC32r %vreg3<tied0>,
%EFLAGS<imp-def,dead>; GR32_NOSP:%vreg3
384B CMP32rr %vreg3, %vreg7, %EFLAGS<imp-def>; GR32_NOSP:%vreg3
GR32:%vreg7
400B %vreg14<def> = COPY %vreg3; GR32_NOSP:%vreg14,%vreg3
416B %vreg15<def> = COPY %vreg2; FR32:%vreg15,%vreg2
432B %vreg16<def> = COPY %vreg2; FR32:%vreg16,%vreg2
448B JB_1 <BB#2>, %EFLAGS<imp-use,kill>
464B JMP_1 <BB#3>
Successors according to CFG: BB#3(4) BB#2(124)
480B BB#3: derived from LLVM BB %bb23
Predecessors according to CFG: BB#2 BB#4
496B %vreg4<def> = COPY %vreg16; FR32:%vreg4,%vreg16
512B MOVSSmr <fi#0>, 1, %noreg, 0, %noreg, %vreg4;
mem:ST4[FixedStack0] FR32:%vreg4
528B %vreg13<def> = LD_Fp32m80 <fi#0>, 1, %noreg, 0, %noreg,
%FPSW<imp-def,dead>; mem:LD4[FixedStack0](align=16) RFP80:%vreg13
544B RETL %vreg13; RFP80:%vreg13
# End machine code for function foo.
With -no-phi-elim-live-out-early-exit:
# *** IR Dump Before Simple Register Coalescing ***:
# Machine code for function foo: Post SSA
Frame Objects:
fi#-3: size=4, align=8, fixed, at location [SP+12]
fi#-2: size=4, align=4, fixed, at location [SP+8]
fi#-1: size=4, align=16, fixed, at location [SP+4]
fi#0: size=4, align=4, at location [SP+4]
0B BB#0: derived from LLVM BB %entry
16B %vreg7<def> = MOV32rm <fi#-3>, 1, %noreg, 0, %noreg;
mem:LD4[FixedStack-3](align=8) GR32:%vreg7
32B %vreg8<def> = FsFLD0SS; FR32:%vreg8
48B TEST32rr %vreg7, %vreg7, %EFLAGS<imp-def>; GR32:%vreg7
64B JNE_1 <BB#1>, %EFLAGS<imp-use,kill>
Successors according to CFG: BB#4(12) BB#1(20)
80B BB#4:
Predecessors according to CFG: BB#0
96B %vreg16<def> = COPY %vreg8; FR32:%vreg16,%vreg8
112B JMP_1 <BB#3>
Successors according to CFG: BB#3
128B BB#1:
Predecessors according to CFG: BB#0
144B %vreg6<def> = MOV32rm <fi#-2>, 1, %noreg, 0, %noreg;
mem:LD4[FixedStack-2] GR32:%vreg6
160B %vreg5<def> = MOV32rm <fi#-1>, 1, %noreg, 0, %noreg;
mem:LD4[FixedStack-1](align=16) GR32:%vreg5
176B %vreg9<def> = MOV32r0 %EFLAGS<imp-def,dead>; GR32:%vreg9
192B %vreg14<def> = COPY %vreg9; GR32_NOSP:%vreg14 GR32:%vreg9
208B %vreg15<def> = COPY %vreg8; FR32:%vreg15,%vreg8
Successors according to CFG: BB#2
224B BB#2: derived from LLVM BB %bb
Predecessors according to CFG: BB#2 BB#1
240B %vreg1<def> = COPY %vreg15; FR32:%vreg1,%vreg15
256B %vreg0<def> = COPY %vreg14; GR32_NOSP:%vreg0,%vreg14
272B %vreg10<def> = CVTSI2SSrm %vreg5, 4, %vreg0, 0, %noreg;
mem:LD4[%scevgep1] FR32:%vreg10 GR32:%vreg5 GR32_NOSP:%vreg0
288B %vreg11<def> = COPY %vreg10; FR32:%vreg11,%vreg10
304B %vreg11<def,tied1> = MULSSrm %vreg11<tied0>, %vreg6, 4, %vreg0,
0, %noreg; mem:LD4[%scevgep] FR32:%vreg11 GR32:%vreg6 GR32_NOSP:%vreg0
320B %vreg2<def> = COPY %vreg11; FR32:%vreg2,%vreg11
336B %vreg2<def,tied1> = ADDSSrr %vreg2<tied0>, %vreg1;
FR32:%vreg2,%vreg1
352B %vreg3<def> = COPY %vreg0; GR32_NOSP:%vreg3,%vreg0
368B %vreg3<def,tied1> = INC32r %vreg3<tied0>,
%EFLAGS<imp-def,dead>; GR32_NOSP:%vreg3
384B CMP32rr %vreg3, %vreg7, %EFLAGS<imp-def>; GR32_NOSP:%vreg3
GR32:%vreg7
400B %vreg14<def> = COPY %vreg3; GR32_NOSP:%vreg14,%vreg3
416B %vreg15<def> = COPY %vreg2; FR32:%vreg15,%vreg2
432B JB_1 <BB#2>, %EFLAGS<imp-use,kill>
Successors according to CFG: BB#5(4) BB#2(124)
448B BB#5:
Predecessors according to CFG: BB#2
464B %vreg16<def> = COPY %vreg2; FR32:%vreg16,%vreg2
Successors according to CFG: BB#3
480B BB#3: derived from LLVM BB %bb23
Predecessors according to CFG: BB#4 BB#5
496B %vreg4<def> = COPY %vreg16; FR32:%vreg4,%vreg16
512B MOVSSmr <fi#0>, 1, %noreg, 0, %noreg, %vreg4;
mem:ST4[FixedStack0] FR32:%vreg4
528B %vreg13<def> = LD_Fp32m80 <fi#0>, 1, %noreg, 0, %noreg,
%FPSW<imp-def,dead>; mem:LD4[FixedStack0](align=16) RFP80:%vreg13
544B RETL %vreg13; RFP80:%vreg13
# End machine code for function foo.
The problem is that RegisterCoalescer visits non-local copies first, since local
copies are potentially easier to coalesce. With the change, "%vreg15<def>
= COPY %vreg2; FR32:%vreg15,%vreg2" becomes a non-local copy because vreg2 is
live-out of BB#2. Thus, this copy is visited first and vreg2/vreg15 are coalesced.
This makes it impossible to later coalesce vreg1/vreg15, which would be preferable.</pre>
</div>
</p>
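<p>The order dependence described above can be illustrated with a toy model. This is only a sketch and not RegisterCoalescer's actual algorithm: the helper names <code>overlaps</code> and <code>coalesce</code> are hypothetical, live ranges are half-open intervals approximated by hand from the slot indexes in the second dump, and interference is assumed to mean any overlap. As a further simplification, vreg2's range is cut at the copy at 416, ignoring its later read by the vreg16 copy, where vreg15 would hold the same value anyway.</p>

```python
def overlaps(a, b):
    """True if any half-open interval in a overlaps any interval in b."""
    return any(s1 < e2 and s2 < e1 for s1, e1 in a for s2, e2 in b)


def coalesce(copies, ranges):
    """Greedily join (dst, src) copy pairs in the given visitation order."""
    ranges = {reg: list(ivs) for reg, ivs in ranges.items()}
    rep = {reg: reg for reg in ranges}  # representative of each merged class

    def find(reg):
        while rep[reg] != reg:
            reg = rep[reg]
        return reg

    joined = []
    for dst, src in copies:
        a, b = find(dst), find(src)
        if a == b or overlaps(ranges[a], ranges[b]):
            continue  # already joined, or the two live ranges interfere
        rep[b] = a
        ranges[a] = ranges[a] + ranges.pop(b)
        joined.append((dst, src))
    return joined


# Approximate live ranges (assumed, read off the slot indexes in the dump).
RANGES = {
    "vreg15": [(208, 240), (416, 448)],  # defined at 208 and 416, read at 240
    "vreg1": [(240, 336)],               # defined at 240, last read at 336
    "vreg2": [(320, 416)],               # defined at 320, copied at 416
}

LOOP_CARRIED = ("vreg15", "vreg2")  # %vreg15 = COPY %vreg2 (416B)
LOOP_HEADER = ("vreg1", "vreg15")   # %vreg1 = COPY %vreg15 (240B)

# Visiting the 416B copy first blocks the 240B copy, and vice versa.
print(coalesce([LOOP_CARRIED, LOOP_HEADER], RANGES))
print(coalesce([LOOP_HEADER, LOOP_CARRIED], RANGES))
```

<p>In this model only one of the two copies can ever be removed, because vreg1 and vreg2 are both live between 320 and 336. Whichever copy is visited first wins, which is why a change in the visitation order changes the result.</p>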
</body>
</html>