[LLVMbugs] [Bug 22768] New: -no-phi-elim-live-out-early-exit (r231064) regresses generated code
bugzilla-daemon at llvm.org
Tue Mar 3 02:25:12 PST 2015
http://llvm.org/bugs/show_bug.cgi?id=22768
Bug ID: 22768
Summary: -no-phi-elim-live-out-early-exit (r231064) regresses
generated code
Product: libraries
Version: trunk
Hardware: PC
OS: All
Status: NEW
Severity: normal
Priority: P
Component: Common Code Generator Code
Assignee: unassignedbugs at nondot.org
Reporter: djasper at google.com
CC: llvmbugs at cs.uiuc.edu
Classification: Unclassified
Specifically in
CodeGen/X86/coalescer-commute4.ll
CodeGen/X86/phys_subreg_coalesce-2.ll
CodeGen/X86/zlib-longest-match.ll
Copies are not coalesced properly. This seems to be caused by the visitation
order in RegisterCoalescer. Looking at CodeGen/X86/coalescer-commute4.ll:
Without -no-phi-elim-live-out-early-exit:
# *** IR Dump Before Simple Register Coalescing ***:
# Machine code for function foo: Post SSA
Frame Objects:
fi#-3: size=4, align=8, fixed, at location [SP+12]
fi#-2: size=4, align=4, fixed, at location [SP+8]
fi#-1: size=4, align=16, fixed, at location [SP+4]
fi#0: size=4, align=4, at location [SP+4]
0B BB#0: derived from LLVM BB %entry
16B %vreg7<def> = MOV32rm <fi#-3>, 1, %noreg, 0, %noreg;
mem:LD4[FixedStack-3](align=8) GR32:%vreg7
32B %vreg8<def> = FsFLD0SS; FR32:%vreg8
48B TEST32rr %vreg7, %vreg7, %EFLAGS<imp-def>; GR32:%vreg7
64B JNE_1 <BB#1>, %EFLAGS<imp-use,kill>
Successors according to CFG: BB#4(12) BB#1(20)
80B BB#4:
Predecessors according to CFG: BB#0
96B %vreg16<def> = COPY %vreg8; FR32:%vreg16,%vreg8
112B JMP_1 <BB#3>
Successors according to CFG: BB#3
128B BB#1:
Predecessors according to CFG: BB#0
144B %vreg6<def> = MOV32rm <fi#-2>, 1, %noreg, 0, %noreg;
mem:LD4[FixedStack-2] GR32:%vreg6
160B %vreg5<def> = MOV32rm <fi#-1>, 1, %noreg, 0, %noreg;
mem:LD4[FixedStack-1](align=16) GR32:%vreg5
176B %vreg9<def> = MOV32r0 %EFLAGS<imp-def,dead>; GR32:%vreg9
192B %vreg14<def> = COPY %vreg9; GR32_NOSP:%vreg14 GR32:%vreg9
208B %vreg15<def> = COPY %vreg8; FR32:%vreg15,%vreg8
Successors according to CFG: BB#2
224B BB#2: derived from LLVM BB %bb
Predecessors according to CFG: BB#2 BB#1
240B %vreg1<def> = COPY %vreg15; FR32:%vreg1,%vreg15
256B %vreg0<def> = COPY %vreg14; GR32_NOSP:%vreg0,%vreg14
272B %vreg10<def> = CVTSI2SSrm %vreg5, 4, %vreg0, 0, %noreg;
mem:LD4[%scevgep1] FR32:%vreg10 GR32:%vreg5 GR32_NOSP:%vreg0
288B %vreg11<def> = COPY %vreg10; FR32:%vreg11,%vreg10
304B %vreg11<def,tied1> = MULSSrm %vreg11<tied0>, %vreg6, 4, %vreg0,
0, %noreg; mem:LD4[%scevgep] FR32:%vreg11 GR32:%vreg6 GR32_NOSP:%vreg0
320B %vreg2<def> = COPY %vreg11; FR32:%vreg2,%vreg11
336B %vreg2<def,tied1> = ADDSSrr %vreg2<tied0>, %vreg1;
FR32:%vreg2,%vreg1
352B %vreg3<def> = COPY %vreg0; GR32_NOSP:%vreg3,%vreg0
368B %vreg3<def,tied1> = INC32r %vreg3<tied0>,
%EFLAGS<imp-def,dead>; GR32_NOSP:%vreg3
384B CMP32rr %vreg3, %vreg7, %EFLAGS<imp-def>; GR32_NOSP:%vreg3
GR32:%vreg7
400B %vreg14<def> = COPY %vreg3; GR32_NOSP:%vreg14,%vreg3
416B %vreg15<def> = COPY %vreg2; FR32:%vreg15,%vreg2
432B %vreg16<def> = COPY %vreg2; FR32:%vreg16,%vreg2
448B JB_1 <BB#2>, %EFLAGS<imp-use,kill>
464B JMP_1 <BB#3>
Successors according to CFG: BB#3(4) BB#2(124)
480B BB#3: derived from LLVM BB %bb23
Predecessors according to CFG: BB#2 BB#4
496B %vreg4<def> = COPY %vreg16; FR32:%vreg4,%vreg16
512B MOVSSmr <fi#0>, 1, %noreg, 0, %noreg, %vreg4;
mem:ST4[FixedStack0] FR32:%vreg4
528B %vreg13<def> = LD_Fp32m80 <fi#0>, 1, %noreg, 0, %noreg,
%FPSW<imp-def,dead>; mem:LD4[FixedStack0](align=16) RFP80:%vreg13
544B RETL %vreg13; RFP80:%vreg13
# End machine code for function foo.
With -no-phi-elim-live-out-early-exit:
# *** IR Dump Before Simple Register Coalescing ***:
# Machine code for function foo: Post SSA
Frame Objects:
fi#-3: size=4, align=8, fixed, at location [SP+12]
fi#-2: size=4, align=4, fixed, at location [SP+8]
fi#-1: size=4, align=16, fixed, at location [SP+4]
fi#0: size=4, align=4, at location [SP+4]
0B BB#0: derived from LLVM BB %entry
16B %vreg7<def> = MOV32rm <fi#-3>, 1, %noreg, 0, %noreg;
mem:LD4[FixedStack-3](align=8) GR32:%vreg7
32B %vreg8<def> = FsFLD0SS; FR32:%vreg8
48B TEST32rr %vreg7, %vreg7, %EFLAGS<imp-def>; GR32:%vreg7
64B JNE_1 <BB#1>, %EFLAGS<imp-use,kill>
Successors according to CFG: BB#4(12) BB#1(20)
80B BB#4:
Predecessors according to CFG: BB#0
96B %vreg16<def> = COPY %vreg8; FR32:%vreg16,%vreg8
112B JMP_1 <BB#3>
Successors according to CFG: BB#3
128B BB#1:
Predecessors according to CFG: BB#0
144B %vreg6<def> = MOV32rm <fi#-2>, 1, %noreg, 0, %noreg;
mem:LD4[FixedStack-2] GR32:%vreg6
160B %vreg5<def> = MOV32rm <fi#-1>, 1, %noreg, 0, %noreg;
mem:LD4[FixedStack-1](align=16) GR32:%vreg5
176B %vreg9<def> = MOV32r0 %EFLAGS<imp-def,dead>; GR32:%vreg9
192B %vreg14<def> = COPY %vreg9; GR32_NOSP:%vreg14 GR32:%vreg9
208B %vreg15<def> = COPY %vreg8; FR32:%vreg15,%vreg8
Successors according to CFG: BB#2
224B BB#2: derived from LLVM BB %bb
Predecessors according to CFG: BB#2 BB#1
240B %vreg1<def> = COPY %vreg15; FR32:%vreg1,%vreg15
256B %vreg0<def> = COPY %vreg14; GR32_NOSP:%vreg0,%vreg14
272B %vreg10<def> = CVTSI2SSrm %vreg5, 4, %vreg0, 0, %noreg;
mem:LD4[%scevgep1] FR32:%vreg10 GR32:%vreg5 GR32_NOSP:%vreg0
288B %vreg11<def> = COPY %vreg10; FR32:%vreg11,%vreg10
304B %vreg11<def,tied1> = MULSSrm %vreg11<tied0>, %vreg6, 4, %vreg0,
0, %noreg; mem:LD4[%scevgep] FR32:%vreg11 GR32:%vreg6 GR32_NOSP:%vreg0
320B %vreg2<def> = COPY %vreg11; FR32:%vreg2,%vreg11
336B %vreg2<def,tied1> = ADDSSrr %vreg2<tied0>, %vreg1;
FR32:%vreg2,%vreg1
352B %vreg3<def> = COPY %vreg0; GR32_NOSP:%vreg3,%vreg0
368B %vreg3<def,tied1> = INC32r %vreg3<tied0>,
%EFLAGS<imp-def,dead>; GR32_NOSP:%vreg3
384B CMP32rr %vreg3, %vreg7, %EFLAGS<imp-def>; GR32_NOSP:%vreg3
GR32:%vreg7
400B %vreg14<def> = COPY %vreg3; GR32_NOSP:%vreg14,%vreg3
416B %vreg15<def> = COPY %vreg2; FR32:%vreg15,%vreg2
432B JB_1 <BB#2>, %EFLAGS<imp-use,kill>
Successors according to CFG: BB#5(4) BB#2(124)
448B BB#5:
Predecessors according to CFG: BB#2
464B %vreg16<def> = COPY %vreg2; FR32:%vreg16,%vreg2
Successors according to CFG: BB#3
480B BB#3: derived from LLVM BB %bb23
Predecessors according to CFG: BB#4 BB#5
496B %vreg4<def> = COPY %vreg16; FR32:%vreg4,%vreg16
512B MOVSSmr <fi#0>, 1, %noreg, 0, %noreg, %vreg4;
mem:ST4[FixedStack0] FR32:%vreg4
528B %vreg13<def> = LD_Fp32m80 <fi#0>, 1, %noreg, 0, %noreg,
%FPSW<imp-def,dead>; mem:LD4[FixedStack0](align=16) RFP80:%vreg13
544B RETL %vreg13; RFP80:%vreg13
# End machine code for function foo.
The problem is that RegisterCoalescer visits non-local copies first, presumably
because local copies are easier to coalesce. With the flag, "%vreg15<def>
= COPY %vreg2; FR32:%vreg15,%vreg2" becomes a non-local copy because vreg2 is
now live-out of BB#2 (it is used in the new BB#5). Thus, this copy is visited
first and vreg2/vreg15 are coalesced, which makes it impossible to later
coalesce vreg1/vreg15, the preferable pair.
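
To illustrate the ordering effect, here is a toy model in Python. It is only a
sketch and not the RegisterCoalescer code: the names is_local and visit_order
are made up for this example, and the rule "a copy is local if either operand
lives in a single block" is an assumption of the model. The block sets are read
off the two dumps above.

```python
# Toy model of the work-list ordering described above (not LLVM code).
# A copy is a (dst, src) pair; blocks_of maps each vreg to the set of
# basic blocks in which it is defined or used.

def is_local(copy, blocks_of):
    # Assumption of this sketch: a copy counts as local when at least
    # one of its operands is confined to a single basic block.
    dst, src = copy
    return len(blocks_of[dst]) == 1 or len(blocks_of[src]) == 1

def visit_order(copies, blocks_of):
    # Non-local copies are visited first, local copies afterwards.
    # Relative order inside each group is left unchanged.
    non_local = [c for c in copies if not is_local(c, blocks_of)]
    local = [c for c in copies if is_local(c, blocks_of)]
    return non_local + local

# The two FR32 copies in BB#2, in program order (slots 240B and 416B).
copies = [("vreg1", "vreg15"), ("vreg15", "vreg2")]

# Without the flag: vreg2 is defined and used only inside BB#2.
without_flag = {
    "vreg1": {"BB2"},
    "vreg2": {"BB2"},
    "vreg15": {"BB1", "BB2"},
}

# With the flag: the copy to vreg16 moves to BB#5, so vreg2 is live-out of BB#2.
with_flag = {
    "vreg1": {"BB2"},
    "vreg2": {"BB2", "BB5"},
    "vreg15": {"BB1", "BB2"},
}

order_without = visit_order(copies, without_flag)
order_with = visit_order(copies, with_flag)
```

In this model both copies are local without the flag and keep their original
order, while with the flag the vreg15/vreg2 copy moves to the front of the
list, matching the behaviour described above.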