[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.

Ivan Llopard ivanllopard at gmail.com
Thu Oct 25 15:35:51 PDT 2012


You are very close! It looks like it extends the vreg32 interval in BB#3 
from the vreg10 = vreg32 copy at 1) to the next copy, vreg32 = vreg10 at 
5), without verifying that vreg10 still has the same valno at 5).
Given your test case:

1) %vreg10<def> = COPY %vreg32; R600_Reg128:%vreg10,%vreg32
2) %vreg10:sel_z<def> = COPY %vreg47; R600_Reg128:%vreg10 R600_Reg32:%vreg47
3) %vreg10:sel_w<def> = COPY %vreg32:sel_w; R600_Reg128:%vreg10,%vreg32
...
4) %vreg47<def> = COPY %vreg32:sel_z; R600_Reg32:%vreg47 R600_Reg128:%vreg32
5) %vreg32<def> = COPY %vreg10; R600_Reg128:%vreg32,%vreg10
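
To see why the copy at 5) cannot simply be dropped, here is a minimal Python sketch that replays the five copies on symbolic lane values. The labels (v32.sel_x, v47, ...) and the replay() helper are assumptions made up for illustration; this models only the listing above, not the coalescer itself.

```python
# Model each 128-bit vreg as a dict of four 32-bit lanes and replay the
# five copies symbolically. Lane values are opaque labels, not real data.
LANES = ("sel_x", "sel_y", "sel_z", "sel_w")


def replay():
    # Symbolic values on entry to BB#3 (hypothetical labels).
    vreg32 = {lane: f"v32.{lane}" for lane in LANES}
    vreg47 = "v47"

    vreg10 = dict(vreg32)               # 1) vreg10 = COPY vreg32
    vreg10["sel_z"] = vreg47            # 2) vreg10:sel_z = COPY vreg47
    vreg10["sel_w"] = vreg32["sel_w"]   # 3) vreg10:sel_w = COPY vreg32:sel_w
    vreg47 = vreg32["sel_z"]            # 4) vreg47 = COPY vreg32:sel_z
    before = dict(vreg32)               # vreg32 just before copy 5
    vreg32 = dict(vreg10)               # 5) vreg32 = COPY vreg10
    return before, vreg32, vreg47


before, after, vreg47 = replay()
# Lanes of vreg32 whose value is changed by copy 5.
changed = [lane for lane in LANES if before[lane] != after[lane]]
print(changed)
```

In this model copy 5) changes the sel_z lane of vreg32 (it receives the old vreg47), so removing it as if it were an identity copy would change the program.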

I think the bug appears when analyzing the copy at 5). Unless I missed 
something, the intervals should be:

RHS = %vreg10 [976r,992r:0)[992r,1040r:1)[1040r,1120r:2)  0@976r 1@992r 2@1040r
LHS = %vreg32 [416r,448B:1)[448B,704r:0)[880B,1104r:0)[1120r,1168B:2)  0@448B-phi 1@416r 2@1120r

Is that correct?
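
The intervals can be checked mechanically. The following Python sketch parses the segment lists quoted above and looks up which value number is live just before a given slot index. The helper names (parse_index, parse_segments, valno_before) are hypothetical and only loosely mirror the kind of query LiveInterval offers; the slot ordering B < e < r < d is my assumption about how the indexes sort.

```python
import re

# Slot order within one instruction index, as printed in the dump:
# B (block) < e (early-clobber) < r (register) < d (dead).
SLOT_RANK = {"B": 0, "e": 1, "r": 2, "d": 3}


def parse_index(text):
    """Turn '1040r' into a sortable (number, slot_rank) pair."""
    return int(text[:-1]), SLOT_RANK[text[-1]]


def parse_segments(text):
    """Parse '[976r,992r:0)[992r,1040r:1)' into (start, end, valno) tuples."""
    pattern = r"\[(\d+[Berd]),(\d+[Berd]):(\d+)\)"
    return [(parse_index(s), parse_index(e), int(v))
            for s, e, v in re.findall(pattern, text)]


def valno_before(segments, index):
    """Return the value number live immediately before `index`, or None.

    Segments are half-open [start, end), so a read at `index` sees the
    segment with start < index <= end.
    """
    point = parse_index(index)
    for start, end, valno in segments:
        if start < point <= end:
            return valno
    return None


rhs = parse_segments("[976r,992r:0)[992r,1040r:1)[1040r,1120r:2)")
lhs = parse_segments("[416r,448B:1)[448B,704r:0)[880B,1104r:0)[1120r,1168B:2)")

rhs_at_copy = valno_before(rhs, "1120r")  # vreg10 value read by the copy at 1120r
lhs_at_copy = valno_before(lhs, "1120r")  # vreg32 value live just before that copy
print(rhs_at_copy, lhs_at_copy)
```

With these intervals, the copy at 1120r reads value number 2 of vreg10 (defined at 1040r), while vreg32 has no live value between 1104r and 1120r.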

If so, I guess the logic in adjustCopiesBackFrom() is looking at the 
wrong copy, 3), which is coalescable, and continuing without noticing 
the problem. It looks like an easy fix, I hope :-).

Regards,
Ivan


On 25/10/2012 23:01, Vincent Lejeune wrote:
> Thanks for your help. You're right, merging vreg32 and vreg48 is perfectly fine, sorry I missed that.
> I debugged by brute force, adding a MachineFunction dump after each join, and I think I found the issue: it occurs when vreg32 and vreg10 are merged.
> vreg10 only appears in BB#3, and the join apparently only occurs in BB#3, even though vreg32 is live in all 4 machine blocks.
> After joining, there are still vreg32 occurrences in the MachineFunction dump.
>
> Before the join, the MF dump is:
> _________________
> # Machine code for function main: Post SSA
> Function Live Ins: %T1_X in %vreg14, %T1_Y in %vreg15, %T1_Z in %vreg16, %T1_W in %vreg17
> Function Live Outs: %T1_W %T1_Z %T1_Y %T1_X %T2_W %T2_Z %T2_Y %T2_X
>
> BB#0: derived from LLVM BB %0
>      Live Ins: %T1_X %T1_Y %T1_Z %T1_W
> %vreg17<def> = COPY %T1_W; R600_TReg32:%vreg17
> %vreg16<def> = COPY %T1_Z; R600_TReg32:%vreg16
> %vreg15<def> = COPY %T1_Y; R600_TReg32:%vreg15
> %vreg14<def> = COPY %T1_X; R600_TReg32:%vreg14
> %vreg18<def> = COPY %C1_X; R600_Reg32:%vreg18
> %vreg19:sel_x<def,read-undef> = COPY %vreg14<kill>; R600_Reg128:%vreg19 R600_TReg32:%vreg14
> %vreg2<def> = COPY %C1_Y; R600_Reg32:%vreg2
> %vreg21:sel_x<def,read-undef> = COPY %vreg18<kill>; R600_Reg128:%vreg21 R600_Reg32:%vreg18
> %vreg23<def> = COPY %vreg19<kill>; R600_Reg128:%vreg23,%vreg19
> %vreg23:sel_y<def> = COPY %vreg15<kill>; R600_Reg128:%vreg23 R600_TReg32:%vreg15
> %vreg24<def> = COPY %vreg21<kill>; R600_Reg128:%vreg24,%vreg21
> %vreg24:sel_y<def> = COPY %vreg2; R600_Reg128:%vreg24 R600_Reg32:%vreg2
> %vreg25<def> = COPY %C1_Z; R600_Reg32:%vreg25
> %vreg26<def> = COPY %vreg23<kill>; R600_Reg128:%vreg26,%vreg23
> %vreg26:sel_z<def> = COPY %vreg16<kill>; R600_Reg128:%vreg26 R600_TReg32:%vreg16
> %vreg27<def> = COPY %vreg24<kill>; R600_Reg128:%vreg27,%vreg24
> %vreg27:sel_z<def> = COPY %vreg25<kill>; R600_Reg128:%vreg27 R600_Reg32:%vreg25
> %vreg28<def> = COPY %C1_W; R600_Reg32:%vreg28
> %vreg3<def> = COPY %vreg27<kill>; R600_Reg128:%vreg3,%vreg27
> %vreg3:sel_w<def> = COPY %vreg28<kill>; R600_Reg128:%vreg3 R600_Reg32:%vreg28
> %vreg1<def> = COPY %vreg26<kill>; R600_Reg128:%vreg1,%vreg26
> %vreg1:sel_w<def> = COPY %vreg17<kill>; R600_Reg128:%vreg1 R600_TReg32:%vreg17
> %vreg13<def> = MOV 1, 0, 0, 0, %ALU_LITERAL_X, 0, 0, 0, 1, pred:%PRED_SEL_OFF, 0; R600_Reg32:%vreg13
> %vreg0<def> = COPY %C0_X; R600_Reg32:%vreg0
> %vreg47<def> = COPY %vreg2<kill>; R600_Reg32:%vreg47,%vreg2
> %vreg32<def> = COPY %vreg3<kill>; R600_Reg128:%vreg32,%vreg3
> %vreg49<def> = COPY %vreg13<kill>; R600_Reg32:%vreg49,%vreg13
>      Successors according to CFG: BB#1
>
> BB#1: derived from LLVM BB %25
>      Predecessors according to CFG: BB#0 BB#3
> %vreg30<def> = SETGT_INT 0, 0, 1, 0, 0, 0, %vreg0, 0, 0, 0, %vreg49, 0, 0, 0, 1, pred:%PRED_SEL_OFF, 0; R600_Reg32:%vreg30,%vreg0,%vreg49
> %PREDICATE_BIT<def> = PRED_X %vreg30, 152, 16; R600_Reg32:%vreg30
> JUMP <BB#3>, pred:%PREDICATE_BIT
> JUMP <BB#2>, pred:%noreg
>      Successors according to CFG: BB#2(4) BB#3(124)
>
> BB#2: derived from LLVM BB %31
>      Predecessors according to CFG: BB#1
> %vreg39<def> = COPY %vreg32:sel_x; R600_Reg32:%vreg39 R600_Reg128:%vreg32
> %T2_X<def> = COPY %vreg39<kill>; R600_Reg32:%vreg39
> %vreg40<def> = COPY %vreg32:sel_y; R600_Reg32:%vreg40 R600_Reg128:%vreg32
> %T2_Y<def> = COPY %vreg40<kill>; R600_Reg32:%vreg40
> %vreg41<def> = COPY %vreg32:sel_z; R600_Reg32:%vreg41 R600_Reg128:%vreg32
> %T2_Z<def> = COPY %vreg41<kill>; R600_Reg32:%vreg41
> %vreg42<def> = COPY %vreg32:sel_w; R600_Reg32:%vreg42 R600_Reg128:%vreg32
> %T2_W<def> = COPY %vreg42<kill>; R600_Reg32:%vreg42
> %vreg43<def> = COPY %vreg1:sel_x; R600_Reg32:%vreg43 R600_Reg128:%vreg1
> %T1_X<def> = COPY %vreg43<kill>; R600_Reg32:%vreg43
> %vreg44<def> = COPY %vreg1:sel_y; R600_Reg32:%vreg44 R600_Reg128:%vreg1
> %T1_Y<def> = COPY %vreg44<kill>; R600_Reg32:%vreg44
> %vreg45<def> = COPY %vreg1:sel_z; R600_Reg32:%vreg45 R600_Reg128:%vreg1
> %T1_Z<def> = COPY %vreg45<kill>; R600_Reg32:%vreg45
> %vreg46<def> = COPY %vreg1:sel_w<kill>; R600_Reg32:%vreg46 R600_Reg128:%vreg1
> %T1_W<def> = COPY %vreg46<kill>; R600_Reg32:%vreg46
> RETURN %T1_W<imp-use>, %T1_Z<imp-use>, %T1_Y<imp-use>, %T1_X<imp-use>, %T2_W<imp-use,kill>, %T2_Z<imp-use,kill>, %T2_Y<imp-use,kill>, %T2_X<imp-use,kill>
>
> BB#3: derived from LLVM BB %41
>      Predecessors according to CFG: BB#1
> %vreg10<def> = COPY %vreg32; R600_Reg128:%vreg10,%vreg32
> %vreg10:sel_z<def> = COPY %vreg47; R600_Reg128:%vreg10 R600_Reg32:%vreg47
> %vreg10:sel_w<def> = COPY %vreg32:sel_w; R600_Reg128:%vreg10,%vreg32
> %vreg38<def> = MOV 1, 0, 0, 0, %ALU_LITERAL_X, 0, 0, 0, 1, pred:%PRED_SEL_OFF, 1; R600_Reg32:%vreg38
> %vreg11<def> = ADD_INT 0, 0, 1, 0, 0, 0, %vreg49, 0, 0, 0, %vreg38<kill>, 0, 0, 0, 1, pred:%PRED_SEL_OFF, 0; R600_Reg32:%vreg11,%vreg49,%vreg38
> %vreg47<def> = COPY %vreg32:sel_z; R600_Reg32:%vreg47 R600_Reg128:%vreg32
> %vreg32<def> = COPY %vreg10; R600_Reg128:%vreg32,%vreg10
> %vreg49<def> = COPY %vreg11<kill>; R600_Reg32:%vreg49,%vreg11
> JUMP <BB#1>, pred:%noreg
>      Successors according to CFG: BB#1
>
> # End machine code for function main.
>
> ___________________
>
> After the join, it is:
> _____________________________
> # Machine code for function main: Post SSA
> Function Live Ins: %T1_X in %vreg14, %T1_Y in %vreg15, %T1_Z in %vreg16, %T1_W in %vreg17
> Function Live Outs: %T1_W %T1_Z %T1_Y %T1_X %T2_W %T2_Z %T2_Y %T2_X
>
> BB#0: derived from LLVM BB %0
>      Live Ins: %T1_X %T1_Y %T1_Z %T1_W
> %vreg17<def> = COPY %T1_W; R600_TReg32:%vreg17
> %vreg16<def> = COPY %T1_Z; R600_TReg32:%vreg16
> %vreg15<def> = COPY %T1_Y; R600_TReg32:%vreg15
> %vreg14<def> = COPY %T1_X; R600_TReg32:%vreg14
> %vreg18<def> = COPY %C1_X; R600_Reg32:%vreg18
> %vreg19:sel_x<def,read-undef> = COPY %vreg14<kill>; R600_Reg128:%vreg19 R600_TReg32:%vreg14
> %vreg2<def> = COPY %C1_Y; R600_Reg32:%vreg2
> %vreg21:sel_x<def,read-undef> = COPY %vreg18<kill>; R600_Reg128:%vreg21 R600_Reg32:%vreg18
> %vreg23<def> = COPY %vreg19<kill>; R600_Reg128:%vreg23,%vreg19
> %vreg23:sel_y<def> = COPY %vreg15<kill>; R600_Reg128:%vreg23 R600_TReg32:%vreg15
> %vreg24<def> = COPY %vreg21<kill>; R600_Reg128:%vreg24,%vreg21
> %vreg24:sel_y<def> = COPY %vreg2; R600_Reg128:%vreg24 R600_Reg32:%vreg2
> %vreg25<def> = COPY %C1_Z; R600_Reg32:%vreg25
> %vreg26<def> = COPY %vreg23<kill>; R600_Reg128:%vreg26,%vreg23
> %vreg26:sel_z<def> = COPY %vreg16<kill>; R600_Reg128:%vreg26 R600_TReg32:%vreg16
> %vreg27<def> = COPY %vreg24<kill>; R600_Reg128:%vreg27,%vreg24
> %vreg27:sel_z<def> = COPY %vreg25<kill>; R600_Reg128:%vreg27 R600_Reg32:%vreg25
> %vreg28<def> = COPY %C1_W; R600_Reg32:%vreg28
> %vreg3<def> = COPY %vreg27<kill>; R600_Reg128:%vreg3,%vreg27
> %vreg3:sel_w<def> = COPY %vreg28<kill>; R600_Reg128:%vreg3 R600_Reg32:%vreg28
> %vreg1<def> = COPY %vreg26<kill>; R600_Reg128:%vreg1,%vreg26
> %vreg1:sel_w<def> = COPY %vreg17<kill>; R600_Reg128:%vreg1 R600_TReg32:%vreg17
> %vreg13<def> = MOV 1, 0, 0, 0, %ALU_LITERAL_X, 0, 0, 0, 1, pred:%PRED_SEL_OFF, 0; R600_Reg32:%vreg13
> %vreg0<def> = COPY %C0_X; R600_Reg32:%vreg0
> %vreg47<def> = COPY %vreg2<kill>; R600_Reg32:%vreg47,%vreg2
> %vreg32<def> = COPY %vreg3<kill>; R600_Reg128:%vreg32,%vreg3
> %vreg49<def> = COPY %vreg13<kill>; R600_Reg32:%vreg49,%vreg13
>      Successors according to CFG: BB#1
>
> BB#1: derived from LLVM BB %25
>      Predecessors according to CFG: BB#0 BB#3
> %vreg30<def> = SETGT_INT 0, 0, 1, 0, 0, 0, %vreg0, 0, 0, 0, %vreg49, 0, 0, 0, 1, pred:%PRED_SEL_OFF, 0; R600_Reg32:%vreg30,%vreg0,%vreg49
> %PREDICATE_BIT<def> = PRED_X %vreg30, 152, 16; R600_Reg32:%vreg30
> JUMP <BB#3>, pred:%PREDICATE_BIT
> JUMP <BB#2>, pred:%noreg
>      Successors according to CFG: BB#2(4) BB#3(124)
>
> BB#2: derived from LLVM BB %31
>      Predecessors according to CFG: BB#1
> %vreg39<def> = COPY %vreg32:sel_x; R600_Reg32:%vreg39 R600_Reg128:%vreg32
> %T2_X<def> = COPY %vreg39<kill>; R600_Reg32:%vreg39
> %vreg40<def> = COPY %vreg32:sel_y; R600_Reg32:%vreg40 R600_Reg128:%vreg32
> %T2_Y<def> = COPY %vreg40<kill>; R600_Reg32:%vreg40
> %vreg41<def> = COPY %vreg32:sel_z; R600_Reg32:%vreg41 R600_Reg128:%vreg32
> %T2_Z<def> = COPY %vreg41<kill>; R600_Reg32:%vreg41
> %vreg42<def> = COPY %vreg32:sel_w; R600_Reg32:%vreg42 R600_Reg128:%vreg32
> %T2_W<def> = COPY %vreg42<kill>; R600_Reg32:%vreg42
> %vreg43<def> = COPY %vreg1:sel_x; R600_Reg32:%vreg43 R600_Reg128:%vreg1
> %T1_X<def> = COPY %vreg43<kill>; R600_Reg32:%vreg43
> %vreg44<def> = COPY %vreg1:sel_y; R600_Reg32:%vreg44 R600_Reg128:%vreg1
> %T1_Y<def> = COPY %vreg44<kill>; R600_Reg32:%vreg44
> %vreg45<def> = COPY %vreg1:sel_z; R600_Reg32:%vreg45 R600_Reg128:%vreg1
> %T1_Z<def> = COPY %vreg45<kill>; R600_Reg32:%vreg45
> %vreg46<def> = COPY %vreg1:sel_w<kill>; R600_Reg32:%vreg46 R600_Reg128:%vreg1
> %T1_W<def> = COPY %vreg46<kill>; R600_Reg32:%vreg46
> RETURN %T1_W<imp-use>, %T1_Z<imp-use>, %T1_Y<imp-use>, %T1_X<imp-use>, %T2_W<imp-use,kill>, %T2_Z<imp-use,kill>, %T2_Y<imp-use,kill>, %T2_X<imp-use,kill>
>
> BB#3: derived from LLVM BB %41
>      Predecessors according to CFG: BB#1
> %vreg10<def> = COPY %vreg32; R600_Reg128:%vreg10,%vreg32
> %vreg10:sel_z<def> = COPY %vreg47; R600_Reg128:%vreg10 R600_Reg32:%vreg47
> %vreg10:sel_w<def,dead> = COPY %vreg32:sel_w; R600_Reg128:%vreg10,%vreg32
> %vreg38<def> = MOV 1, 0, 0, 0, %ALU_LITERAL_X, 0, 0, 0, 1, pred:%PRED_SEL_OFF, 1; R600_Reg32:%vreg38
> %vreg11<def> = ADD_INT 0, 0, 1, 0, 0, 0, %vreg49, 0, 0, 0, %vreg38<kill>, 0, 0, 0, 1, pred:%PRED_SEL_OFF, 0; R600_Reg32:%vreg11,%vreg49,%vreg38
> %vreg47<def> = COPY %vreg32:sel_z; R600_Reg32:%vreg47 R600_Reg128:%vreg32
> %vreg49<def> = COPY %vreg11<kill>; R600_Reg32:%vreg49,%vreg11
> JUMP <BB#1>, pred:%noreg
>      Successors according to CFG: BB#1
>
> # End machine code for function main.
> ___________________________________________
>
>
> ----- Original message -----
>> From: Ivan Llopard <ivanllopard at gmail.com>
>> To: Vincent Lejeune <vljn at ovi.com>
>> Cc: "llvmdev at cs.uiuc.edu" <llvmdev at cs.uiuc.edu>
>> Sent: Thursday, October 25, 2012, 21:54
>> Subject: Re: [LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.
>>
>> Hi Vincent,
>>
>> On 25/10/2012 18:14, Vincent Lejeune wrote:
>>>   When examining the debug output of regalloc, it seems that joining a 32-bit reg also joins its 128-bit parent reg.
>>>
>>>   If I look at the instruction:
>>>   %vreg34<def> = COPY %vreg6:sel_y; R600_Reg32:%vreg34 R600_Reg128:%vreg6
>>>
>>>   it gets joined to:
>>>   928B %vreg34<def> = COPY %vreg48:sel_y;
>>>
>>>   when vreg6 and vreg48 are joined. That is right.
>>>
>>>   But joining the following copy
>>>
>>>   912B %vreg32:sel_x<def,read-undef> = COPY %vreg48:sel_x; R600_Reg128:%vreg32,%vreg48
>>>
>>>   updates it to
>>>   928B %vreg34<def> = COPY %vreg32:sel_y; R600_Reg32:%vreg34 R600_Reg128:%vreg32
>>>
>>>   which is wrong. vreg32:sel_y is undef.
>>
>> Well, that seems correct to me. Following the code and debug output, the
>> joining goes vreg6->vreg48->vreg32. vreg6 is defined in BB#0, so after
>> joining it becomes vreg32, and its sel_y subreg is defined as well.
>> Try running the verifier with -verify-coalescing.
>> If you have implemented the coalescer hooks in your TargetRegisterInfo,
>> you can also try disabling them to check the implementation. I hope this
>> helps to spot the problem.
>>
>> Ivan
>>
>>>
>>>   Regards,
>>>   Vincent
>>>
>>


