[PATCH] D31821: Remove redundant copy in recurrences

Taewook Oh via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 10 11:06:06 PDT 2017


twoh created this revision.

If a chain of instructions forms a recurrence, commuting operands can help remove a redundant copy. Consider the following example code:

  BB#1: ; Loop Header
    %vreg0<def> = COPY %vreg13<kill>; GR32:%vreg0,%vreg13
    ...
  
  BB#6: ; Loop Latch
    %vreg2<def> = COPY %vreg15<kill>; GR32:%vreg2,%vreg15
    %vreg10<def,tied1> = ADD32rr %vreg1<kill,tied0>, %vreg0<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg10,%vreg1,%vreg0
    %vreg3<def,tied1> = ADD32rr %vreg2<kill,tied0>, %vreg10<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg3,%vreg2,%vreg10
    CMP32ri8 %vreg3, 10, %EFLAGS<imp-def>; GR32:%vreg3
    %vreg13<def> = COPY %vreg3<kill>; GR32:%vreg13,%vreg3
    JL_1 <BB#1>, %EFLAGS<imp-use,kill>

The existing two-address generation pass generates the following code:

  BB#1:
    %vreg0<def> = COPY %vreg13<kill>; GR32:%vreg0,%vreg13
    ...
  
  BB#6:
      Predecessors according to CFG: BB#5 BB#4
    %vreg2<def> = COPY %vreg15<kill>; GR32:%vreg2,%vreg15
    %vreg10<def> = COPY %vreg1<kill>; GR32:%vreg10,%vreg1
    %vreg10<def,tied1> = ADD32rr %vreg10<tied0>, %vreg0<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg10,%vreg0
    %vreg3<def> = COPY %vreg10<kill>; GR32:%vreg3,%vreg10
    %vreg3<def,tied1> = ADD32rr %vreg3<tied0>, %vreg2<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg3,%vreg2
    CMP32ri8 %vreg3, 10, %EFLAGS<imp-def>; GR32:%vreg3
    %vreg13<def> = COPY %vreg3<kill>; GR32:%vreg13,%vreg3
    JL_1 <BB#1>, %EFLAGS<imp-use,kill>
    JMP_1 <BB#7>

This is suboptimal because the generated assembly has a redundant copy at the end of BB#6 to feed %vreg13 to BB#1:

  .LBB0_6:
    addl  %esi, %edi
    addl  %ebx, %edi
    cmpl  $10, %edi
    movl  %edi, %esi
    jl  .LBB0_1

This redundant copy can be eliminated by making the instructions in the recurrence chain compute the value "into" the register that actually holds the feedback value. In this example, this can be achieved by commuting %vreg0 and %vreg1 in the instruction that computes %vreg10. With that change, the code after two-address generation becomes

  BB#1:
  
  BB#6: derived from LLVM BB %bb7
      Predecessors according to CFG: BB#5 BB#4
    %vreg2<def> = COPY %vreg15<kill>; GR32:%vreg2,%vreg15
    %vreg10<def> = COPY %vreg0<kill>; GR32:%vreg10,%vreg0
    %vreg10<def,tied1> = ADD32rr %vreg10<tied0>, %vreg1<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg10,%vreg1
    %vreg3<def> = COPY %vreg10<kill>; GR32:%vreg3,%vreg10
    %vreg3<def,tied1> = ADD32rr %vreg3<tied0>, %vreg2<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg3,%vreg2
    CMP32ri8 %vreg3, 10, %EFLAGS<imp-def>; GR32:%vreg3
    %vreg13<def> = COPY %vreg3<kill>; GR32:%vreg13,%vreg3
    JL_1 <BB#1>, %EFLAGS<imp-use,kill>
    JMP_1 <BB#7>

and the final assembly does not have the redundant copy:

  .LBB0_6:
    addl  %edi, %eax
    addl  %ebx, %eax
    cmpl  $10, %eax
    jl  .LBB0_1
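
The commuting decision described above can be illustrated with a small standalone toy model. This is only a sketch of the idea, not the code in the patch: `ToyAdd` and `commuteForRecurrence` are hypothetical names, registers are modeled as plain strings, and every instruction is assumed to be commutable. The model walks the chain in program order and, whenever the value carried around the loop appears as the non-tied operand, swaps the operands so that the result is computed into the register holding that value.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy stand-in for a commutable two-address instruction:
//   Dst = Tied + Other
// After two-address lowering, Dst and Tied end up sharing one register.
struct ToyAdd {
  std::string Dst;
  std::string Tied;
  std::string Other;
};

// Walks the chain in program order. Feedback names the register that
// currently carries the recurrence value (initially the loop-header value).
// If that register is the non-tied operand, the operands are swapped so the
// recurrence value becomes the tied operand. Returns the number of
// instructions that were commuted.
std::size_t commuteForRecurrence(std::vector<ToyAdd> &Chain,
                                 std::string Feedback) {
  std::size_t NumCommuted = 0;
  for (ToyAdd &Add : Chain) {
    if (Add.Other == Feedback) {
      std::swap(Add.Tied, Add.Other);
      ++NumCommuted;
    }
    // The recurrence value now flows through this instruction's result.
    if (Add.Tied == Feedback)
      Feedback = Add.Dst;
  }
  return NumCommuted;
}
```

Feeding the model the two ADD32rr instructions from the loop latch in their original operand order, with %vreg0 as the feedback value, makes %vreg0 the tied operand of the first add and %vreg10 the tied operand of the second. In the dumps above the existing pass already ties %vreg10 in the second add, so only the first add differs between the two outputs.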


https://reviews.llvm.org/D31821

Files:
  lib/CodeGen/TwoAddressInstructionPass.cpp
  test/CodeGen/X86/twoaddr-recurrence.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D31821.94547.patch
Type: text/x-patch
Size: 11448 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170410/1b49d92c/attachment.bin>

