[LLVMdev] Unnecessary reload of register

Thu Aug 28 11:19:38 PDT 2014

Hi Markus,

> On Aug 28, 2014, at 9:57 AM, Markus Timpl <tima0900 at googlemail.com> wrote:
> 
> Hi all,
> I'm trying to figure out why LLVM (version 3.1; I can't easily try another version) inserts an unnecessary load for one register.
>  
> The following is the code before instruction selection:
> ----------------------------------------------------------------------------
>     %call = tail call i32 @_Z7zahlIntv()
>     %0 = inttoptr i32 %call to i32*
>     %1 = load i32* %0, align 4, !tbaa !0
>     %arrayidx1 = getelementptr inbounds i32* %0, i32 1
>     %2 = load i32* %arrayidx1, align 4, !tbaa !0
>     %mul = mul nsw i32 %2, %1
>     ret i32 %mul
> ----------------------------------------------------------------------------
>  
> After instruction selection I get the following:
>  
> ----------------------------------------------------------------------------
>   BB#0: derived from LLVM BB %entry
>    ADJCALLSTACKDOWN 8, %SP<imp-def,dead>, %SP<imp-use>
>    JSUB <ga:@_Z7zahlIntv>, 1, <fi#-2>, 0, <fi#0>, 0, ...
>    ADJCALLSTACKUP 8, 0, %SP<imp-def,dead>, %SP<imp-use>
>    %vreg0<def> = LDrid <fi#-2>, 4; mem:LD4[FixedStack-2] AkkuDRegs:%vreg0
>    %vreg2<def> = COPY %vreg0; PointerAdrRegs:%vreg2 AkkuDRegs:%vreg0
>    %vreg1<def> = LDridAddr %vreg2, 4; mem:LD4[%arrayidx1](tbaa=!"int") AkkuDRegs:%vreg1 PointerAdrRegs:%vreg2
>    %vreg4<def> = COPY %vreg0; PointerAdrRegs:%vreg4 AkkuDRegs:%vreg0
>    %vreg3<def> = MULINTDLDridAddr %vreg1<kill>, %vreg4, 0; mem:LD4[%0](tbaa=!"int") AkkuDRegs:%vreg3,%vreg1 PointerAdrRegs:%vreg4
>    STrid <fi#-, 0, %vreg3<kill>; mem:ST4[FixedStack-1] AkkuDRegs:%vreg3
> ----------------------------------------------------------------------------
>  
> So far everything seems to look fine. But I don't understand why there is a %vreg4 as it has the same value as %vreg2.

I’d suggest that you check how the lowering is actually done for your target to end up with those two copies of the same value.

>  
> At the end I get the following:
>  
> ----------------------------------------------------------------------------
>    JSUB <ga:@_Z7zahlIntv>, 1, %SP, 24, %SP, 0, %AKKU1D<imp-def,dead>, %AR2<imp-def,dead>
>    %AKKU1D<def> = LDrid %SP, 28; mem:LD4[FixedStack-2]
>    STrid %SP, 8, %AKKU1D<kill>
>    %AKKU1D<def> = LDrid %SP, 8
>    %AR2<def> = LAR2d %AKKU1D<kill>
>    %AKKU1D<def> = LDridAddr %AR2<kill>, 4; mem:LD4[%arrayidx1](tbaa=!"int")
>    STrid %SP, 12, %AKKU1D<kill>
>    %AKKU1D<def> = LDrid %SP, 8
>    %AR2<def> = LAR2d %AKKU1D<kill>
>    %AKKU1D<def> = LDrid %SP, 12
>    %AKKU1D<def> = MULINTDLDridAddr %AKKU1D<kill>, %AR2<kill>, 0; mem:LD4[%0](tbaa=!"int")
>    STrid %SP, 0, %AKKU1D<kill>; mem:ST4[FixedStack-1]
> ----------------------------------------------------------------------------
>  
> It should be obvious that the second "  %AR2<def> = LAR2d %AKKU1D<kill>" is unnecessary. I have no idea why LLVM thinks it needs to fill the register with the proper value again.

The thing is that LLVM does not keep track of the values. It sees three virtual registers, vreg0, vreg2, and vreg4, and the only thing special about them is that vreg2 and vreg4 are each copy-related to vreg0. Apart from the register coalescer, nothing attempts to merge these values. You could check the debug output of the register coalescer to see why it is not merging them (-debug-only=regalloc), though I suspect that it is because PointerAdrRegs and AkkuDRegs are not coalescable.
If that is the case, we do have an optimization in the peephole optimizer that rewrites COPYs to avoid cross-register-file copies. It wouldn’t catch this case[1], but it could be taught to.

Another option would be to check why MachineCSE does not catch this.

Anyway, the best way to avoid these redundant copies is not to emit them in the first place :).

[1] The copy rewriting works bottom-up:
A = b
c = A
=>
A = b
c = b
but what you want here is a bit different (you look for all the alternative sources):
A = b
C = b
=>
A = b
C = A
Note: Uppercase and lowercase registers are in different register files.
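
To make the second rewrite in [1] concrete, here is a minimal toy sketch in Python. It is not the code of LLVM's peephole optimizer; the names reg_file and rewrite_cross_file_copies are hypothetical, and it models only a straight-line list of COPYs, using the footnote's convention that letter case encodes the register file. In the thread's terms, b plays the role of vreg0 (AkkuDRegs) while A and C play vreg2 and vreg4 (PointerAdrRegs).

```python
def reg_file(reg):
    """Toy convention from the footnote: letter case encodes the register file."""
    return "upper" if reg.isupper() else "lower"


def rewrite_cross_file_copies(copies):
    """Rewrite `C = b` into `C = A` when an earlier `A = b` is still valid.

    `copies` is a list of (dst, src) pairs in program order, each one a COPY.
    Only cross-register-file copies are candidates for rewriting.
    """
    available = {}  # src -> earlier destinations that still hold src's value
    result = []
    for dst, src in copies:
        new_src = src
        if reg_file(dst) != reg_file(src):
            # Look for an alternative source already in dst's register file.
            for alt in available.get(src, []):
                if reg_file(alt) == reg_file(dst):
                    new_src = alt
                    break
        result.append((dst, new_src))
        # Redefining dst invalidates every copy of dst's old value ...
        available.pop(dst, None)
        # ... and dst itself no longer holds whatever it held before.
        for alts in available.values():
            if dst in alts:
                alts.remove(dst)
        # dst now holds the same value as the original source.
        available.setdefault(src, []).append(dst)
    return result
```

Calling rewrite_cross_file_copies([("A", "b"), ("C", "b")]) turns the second copy into ("C", "A"), so it stays within one register file. If b is redefined between the two copies, the second copy is left alone, because A no longer holds b's current value.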

Thanks,
-Quentin

> No instruction in the whole block destroys the value in AR2. Might it be because %AR2 is marked kill in the LDridAddr instruction? If so, how can I mark the instruction as not destroying the register value? The following is the definition of the instruction:
>  
> ----------------------------------------------------------------------------
> def MEMirPtr : Operand<i32> {
>   let PrintMethod = "printMemOperand";
>   let MIOperandInfo = (ops PointerAdrRegs, i64imm);
> }
> 
> class AwlInst<dag outs, dag ins, string asmstr, list<dag> pattern, int size = 0> : Instruction {
>   field bits<32> Inst;
>   let Namespace = "Awl0";
>   dag OutOperandList = outs;
>   dag InOperandList = ins;
>   let AsmString = asmstr;
>   let Pattern = pattern;
>   let Size = size;
> }
> 
> def LDridAddr : AwlInst<(outs AkkuDRegs:$dst), (ins MEMirPtr:$addr),
>                         "L D$addr; \t// LDridAddr $addr -> $dst",
>                         [(set AkkuDRegs:$dst, (load addraddr:$addr))], 4>;
> 
> ----------------------------------------------------------------------------
> 
>  
>  
> The complete output of print-after-all is attached to this mail.
> 
> Any hints are appreciated.
> 
> Thanks in advance,
> 
> Markus
> 
> <printafterallOutput.txt>
