[PATCH] D68205: [ModuloSchedule] Peel out prologs and epilogs, generate actual code

James Molloy via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 1 05:14:28 PDT 2019


jmolloy added a comment.

Hi Thomas,

I made an example to show how this is handled. Here is the input loop body, with each instruction tagged with its stage and cycle:

  %11:intregs = S2_addasl_rrri %7, %6, 1, post-instr-symbol <mcsymbol Stage-0_Cycle-0>
  %12:intregs = L2_loadruh_io %11, -4, post-instr-symbol <mcsymbol Stage-1_Cycle-0> :: (load 2 from %ir.cgep2, !tbaa !0)
  %5:intregs = S2_storerh_pi %6, -2, %12, post-instr-symbol <mcsymbol Stage-2_Cycle-0> :: (store 2 into %ir.lsr.iv, !tbaa !0)
  ENDLOOP0 %bb.3, implicit-def $pc, implicit-def $lc0, implicit $sa0, implicit $lc0

From this we generate the following code (annotated):

   <... prolog, boring ...>
  
  bb.3.b2 (address-taken):  // Kernel.
    successors: %bb.3(0x7c000000), %bb.10(0x04000000)
  
    %15:intregs = PHI %25, %bb.6, %11, %bb.3
    %17:intregs = PHI %26, %bb.6, %12, %bb.3
    %11:intregs = S2_addasl_rrri %7, %6, 1, post-instr-symbol <mcsymbol Stage-0_Cycle-0>
    %12:intregs = L2_loadruh_io %15, -4, post-instr-symbol <mcsymbol Stage-1_Cycle-0> :: (load 2 from %ir.cgep2, !tbaa !0)
    dead %5:intregs = S2_storerh_pi %6, -2, %17, post-instr-symbol <mcsymbol Stage-2_Cycle-0> :: (store 2 into %ir.lsr.iv, !tbaa !0)
    ENDLOOP0 %bb.3, implicit-def $pc, implicit-def $lc0, implicit $sa0, implicit $lc0
    J2_jump %bb.10, implicit-def $pc
  
  bb.10.b2:  // Epilog 0, runs stage 2
    successors: %bb.9(0x80000000)
  
    %40:intregs = PHI %11, %bb.3, %25, %bb.6
    %41:intregs = PHI %12, %bb.3, %26, %bb.6
    dead %44:intregs = S2_storerh_pi %6, -2, %41, post-instr-symbol <mcsymbol Stage-2_Cycle-0> :: (store 2 into %ir.lsr.iv, !tbaa !0)
    J2_jump %bb.9, implicit-def $pc
  
  bb.9.b2:  // Start of Epilog 1, runs stage 1
    successors: %bb.8(0x80000000)
  
    %35:intregs = PHI %40, %bb.10, %20, %bb.5
    %38:intregs = L2_loadruh_io %35, -4, post-instr-symbol <mcsymbol Stage-1_Cycle-0> :: (load 2 from %ir.cgep2, !tbaa !0)
    J2_jump %bb.8, implicit-def $pc
  
  bb.8.b2:  // Next stage of Epilog 1, runs stage 2
    successors: %bb.7(0x80000000)
  
    dead %34:intregs = S2_storerh_pi %6, -2, %38, post-instr-symbol <mcsymbol Stage-2_Cycle-0> :: (store 2 into %ir.lsr.iv, !tbaa !0)
    J2_jump %bb.7, implicit-def $pc

The key is that although Epilog 1 (E1) runs stages {1,2}, we *don't* create a single block with both stages {1,2} enabled. Doing so would cause the invalid code issue you mentioned. Instead, we expand E1 into *two* blocks: the first performs stage 1, and the second performs stage 2, consuming the value produced by stage 1.
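
As a rough illustration of that layout (a toy model in Python, not the actual ModuloSchedule code), the sketch below lists which blocks are emitted for the epilogs. The function name `expand_epilogs` is hypothetical, and the generalisation to an arbitrary number of stages is an assumption extrapolated from the three-stage example above.

```python
def expand_epilogs(num_stages):
    """Model the epilog layout for a loop pipelined into `num_stages` stages.

    Returns a list of (epilog_index, stage) pairs, one per generated block,
    in execution order.
    """
    blocks = []
    # Assumption: one epilog per stage that can still be in flight when the
    # kernel exits, i.e. num_stages - 1 epilogs.
    for epilog in range(num_stages - 1):
        # Epilog 0 runs only the last stage; each later epilog starts one
        # stage earlier.
        first_stage = num_stages - 1 - epilog
        for stage in range(first_stage, num_stages):
            # One block per stage, never a block with several stages enabled.
            blocks.append((epilog, stage))
    return blocks


for epilog, stage in expand_epilogs(3):
    print(f"epilog {epilog}: block running stage {stage}")
```

For three stages this yields three blocks, corresponding to bb.10 (E0, stage 2), bb.9 (E1, stage 1) and bb.8 (E1, stage 2) in the dump above.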

That means we do generate superfluous epilog blocks, but these get merged by the control flow optimizer later.

That said, I'm not guaranteeing there are no bugs here. Perhaps the testcase you're thinking of distills to something more complex than my testcase?

Cheers,

James


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68205/new/

https://reviews.llvm.org/D68205




