[PATCH] D68205: [ModuloSchedule] Peel out prologs and epilogs, generate actual code

Thomas Raoux via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 1 09:37:51 PDT 2019


ThomasRaoux accepted this revision.
ThomasRaoux added a comment.
This revision is now accepted and ready to land.

In D68205#1689574 <https://reviews.llvm.org/D68205#1689574>, @jmolloy wrote:

> Hi Thomas,
>
> I made an example to show how this is handled.
>
>   %11:intregs = S2_addasl_rrri %7, %6, 1, post-instr-symbol <mcsymbol Stage-0_Cycle-0>
>   %12:intregs = L2_loadruh_io %11, -4, post-instr-symbol <mcsymbol Stage-1_Cycle-0> :: (load 2 from %ir.cgep2, !tbaa !0)
>   %5:intregs = S2_storerh_pi %6, -2, %12, post-instr-symbol <mcsymbol Stage-2_Cycle-0> :: (store 2 into %ir.lsr.iv, !tbaa !0)
>   ENDLOOP0 %bb.3, implicit-def $pc, implicit-def $lc0, implicit $sa0, implicit $lc0
>   
>
> We generate this code, annotated:
>
>    <... prolog, boring ...>
>   
>   bb.3.b2 (address-taken):  // Kernel.
>     successors: %bb.3(0x7c000000), %bb.10(0x04000000)
>   
>     %15:intregs = PHI %25, %bb.6, %11, %bb.3
>     %17:intregs = PHI %26, %bb.6, %12, %bb.3
>     %11:intregs = S2_addasl_rrri %7, %6, 1, post-instr-symbol <mcsymbol Stage-0_Cycle-0>
>     %12:intregs = L2_loadruh_io %15, -4, post-instr-symbol <mcsymbol Stage-1_Cycle-0> :: (load 2 from %ir.cgep2, !tbaa !0)
>     dead %5:intregs = S2_storerh_pi %6, -2, %17, post-instr-symbol <mcsymbol Stage-2_Cycle-0> :: (store 2 into %ir.lsr.iv, !tbaa !0)
>     ENDLOOP0 %bb.3, implicit-def $pc, implicit-def $lc0, implicit $sa0, implicit $lc0
>     J2_jump %bb.10, implicit-def $pc
>   
>   bb.10.b2:  // Epilog 0, runs stage 2
>     successors: %bb.9(0x80000000)
>   
>     %40:intregs = PHI %11, %bb.3, %25, %bb.6
>     %41:intregs = PHI %12, %bb.3, %26, %bb.6
>     dead %44:intregs = S2_storerh_pi %6, -2, %41, post-instr-symbol <mcsymbol Stage-2_Cycle-0> :: (store 2 into %ir.lsr.iv, !tbaa !0)
>     J2_jump %bb.9, implicit-def $pc
>   
>   bb.9.b2:  // Start of Epilog 1, runs stage 1
>     successors: %bb.8(0x80000000)
>   
>     %35:intregs = PHI %40, %bb.10, %20, %bb.5
>     %38:intregs = L2_loadruh_io %35, -4, post-instr-symbol <mcsymbol Stage-1_Cycle-0> :: (load 2 from %ir.cgep2, !tbaa !0)
>     J2_jump %bb.8, implicit-def $pc
>   
>   bb.8.b2:  // Next stage of Epilog 1, runs stage 2
>     successors: %bb.7(0x80000000)
>   
>     dead %34:intregs = S2_storerh_pi %6, -2, %38, post-instr-symbol <mcsymbol Stage-2_Cycle-0> :: (store 2 into %ir.lsr.iv, !tbaa !0)
>     J2_jump %bb.7, implicit-def $pc
>   
>
> The key is that although E1 runs stages {1,2}, we *don't* create a single block with both stages {1,2} enabled. That would cause the invalid code issue you mentioned. Instead, we expand this into *two* blocks: the first performs stage 1, and the second performs stage 2, which consumes its input from stage 1.
>
> That means we do generate superfluous epilog blocks, but these get merged by the control flow optimizer later.
>
> That said, I'm not guaranteeing there are no bugs here. Perhaps the testcase you're thinking of distills to something more complex than my testcase?
>
> Cheers,
>
> James


Thanks for explaining, that makes sense. I missed that we peel the two stages in two steps. I wonder if there is an easy way to express this in the comment. Anyway, it looks like it works; I think the case I saw follows the same pattern, so it should work.
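
For illustration, the block layout James describes can be sketched roughly as below. This is not the code in the patch: the function name `epilog_blocks` is hypothetical, and extending the pattern beyond the three-stage example above (epilog N finishing the last N+1 stages) is an assumption. The point is only that each epilog gets one block per remaining stage, rather than one block with several stages enabled.

```python
def epilog_blocks(num_stages):
    """Return one (epilog, stage) pair per emitted block, in emission order.

    Hypothetical sketch: assumes epilog e finishes the in-flight iteration
    that still needs the last e + 1 stages, and that each of those stages
    is placed in its own block.
    """
    blocks = []
    for epilog in range(num_stages - 1):
        first_stage = num_stages - 1 - epilog
        # One block per remaining stage, never a block with several stages.
        for stage in range(first_stage, num_stages):
            blocks.append((epilog, stage))
    return blocks


if __name__ == "__main__":
    for epilog, stage in epilog_blocks(3):
        print(f"Epilog {epilog}: block runs stage {stage}")
```

With three stages this yields three blocks, (Epilog 0, stage 2), (Epilog 1, stage 1), and (Epilog 1, stage 2), which corresponds to bb.10, bb.9, and bb.8 in the dump above.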


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68205/new/

https://reviews.llvm.org/D68205




