[PATCH] D38128: Handle COPYs of physregs better (regalloc hints)

Tom Stellard via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 11 09:05:01 PDT 2017


tstellar added a comment.

In https://reviews.llvm.org/D38128#894550, @jonpa wrote:

> > Unfortunately, all the AMDGPU testcases that change from using vcc to a regular sgpr are regressions. Do you have any idea why this is happening?
>
> I found out that since AMDGPU is passing regalloc hints with type 0 (generic), those VCC hints got cleared. After thinking about it, I thought that it would still be good to let target pass hints of type 0 since otherwise target would always have to implement getRegAllocationHints() unnecessarily. I first thought that perhaps type-0 hints should only get cleared if they are not rediscovered (recomputed) in calculateSpillWeightAndHint(). That saved the VCC reg from being removed. But I then realized that it would not make sense to ever remove a target hint, even if it is type-0. It should not be added by target unless there is a special reason for doing so making it more important than other hints, even if it is a COPY hint with less weight than others (not sure that would happen). So I **removed the clearRegAllocationHints()**. This means target can always add a hint and expect it to be first in the hints vector.
>
> For AMDGPU, that meant a lot of the FAILs disappeared, which was nice. There are a few left:
>
> - a lot of v_cmp_eq_f32_e64 -> v_cmp_eq_f32_e32.  Is it OK?


The _e64 and _e32 suffixes say how many bits it takes to encode the instruction, so replacing _e64 with _e32
reduces code size and is an improvement.

I'll have to apply the patch and take a look at before/after assembly output for the rest of these tests.

> 
> 
> - CodeGen/AMDGPU/multilevel-break.ll:
> 
>   The SAVE_BREAK data-flow does not pass the test. Is this a regression?
> 
>   /home/ijonpan/llvm/llvm-dev/test/CodeGen/AMDGPU/multilevel-break.ll:32:13: error: expected string not found in input ; GCN-NEXT: s_or_b64 exec, exec, [[SAVE_BREAK]] ^ <stdin>:38:2: note: scanning from here s_or_b64 exec, exec, s[8:9] ^ <stdin>:38:2: note: with variable "SAVE_BREAK" equal to "s[0:1]" s_or_b64 exec, exec, s[8:9] ^
> 
>   Output with patch:
> 
> 
> 
>   B0_1:                                  ; %LOOP.outer
>                                           ; =>This Loop Header: Depth=1
>                                           ;     Child Loop BB0_2 Depth 2
>           s_mov_b64 s[4:5], 0
>   BB0_2:                                  ; %LOOP
>                                           ;   Parent Loop BB0_1 Depth=1
>                                           ; =>  This Inner Loop Header: Depth=2
>           v_cmp_lt_i32_e32 vcc, v0, v4
>           s_mov_b64 s[6:7], s[2:3]
>           s_and_saveexec_b64 s[0:1], vcc
>           s_xor_b64 s[8:9], exec, s[0:1]
>           ; mask branch BB0_4
>   BB0_3:                                  ; %ENDIF
>                                           ;   in Loop: Header=BB0_2 Depth=2
>           v_add_i32_e32 v0, vcc, 1, v0
>           v_cmp_eq_u32_e32 vcc, v5, v0
>           v_cmp_ne_u32_e64 s[0:1], v5, v0
>           s_or_b64 s[4:5], s[0:1], s[4:5]
>           s_or_b64 s[6:7], vcc, s[2:3]
>   BB0_4:                                  ; %Flow
>                                           ;   in Loop: Header=BB0_2 Depth=2
>           s_or_b64 exec, exec, s[8:9]
>           s_or_b64 s[4:5], s[8:9], s[4:5]
>           s_andn2_b64 exec, exec, s[4:5]
>           s_cbranch_execnz BB0_2
>   ; BB#5:                                 ; %Flow1
>                                           ;   in Loop: Header=BB0_1 Depth=1
>           s_or_b64 exec, exec, s[4:5]
>           s_or_b64 s[2:3], s[8:9], s[6:7]
>           s_andn2_b64 exec, exec, s[2:3]
>           s_cbranch_execnz BB0_1
>   ; BB#6:                                 ; %IF
>           s_endpgm
>   
> 
> 
> 
> 
> - sgpr-control-flow.ll / sgpr_if_else_valu_br
> 
>   This fails as the same register *is* used in the two adds (test case temporarily modified).
> 
>   Target has hinted %vreg17. On trunk, this disappears since the target has hinted with type 0. This is because a COPY is found, which is then the best hint and this replaces the target hint. That copy hint is later ignored since the regclasses do not match. So, without any hint, %vreg3 gets assigned to %SGRP0, which passes.
> 
>   However, with the patch the %vreg17 hint is kept first in order, and the found COPY hint is placed after it. So, strange enough, when the target hint actually survives, the test case fails.
> 
>   This instruction
> 
>   %vreg3<def> = S_ADD_I32 %vreg17, %vreg18, %SCC<imp-def,dead>; SReg_32_XM0:%vreg3 SReg_32_XM0_XEXEC:%vreg17,%vreg18
> 
>   is the cause for the hint in SIShrinkInstructions.cpp:
> 
>   MRI.setRegAllocationHint(Dest->getReg(), 0, Src0->getReg());
> 
>   It seems to me that target is doing something wrong here. Should this be a target-type hint with some extra check, perhaps?
> 
>   @Quentin: you approved the patch but now it has changed a bit per above...




https://reviews.llvm.org/D38128





More information about the llvm-commits mailing list