[llvm-dev] Update control flow graph when splitting a machine basic block?

章明 via llvm-dev llvm-dev at lists.llvm.org
Sat Nov 11 00:20:16 PST 2017


Thank you for your reply!

> Every MachineBasicBlock has a list of successors; you can access it with 
> the successors() accessor.  That's what you should be using for any CFG 
> analysis.

I am aware of these methods of class MachineBasicBlock, which allow one to access a MachineBasicBlock's successors and predecessors in the CFG.
But the CFG itself may no longer be valid if a MachineBasicBlock is split between two control flow instructions.
The accessors of class MachineBasicBlock do not automatically update the CFG.
So there is no way to access the up-to-date CFG.
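
To make the concern concrete, here is a minimal sketch of the bookkeeping a pass would have to do by hand when it splits a block. ToyBlock and splitAt are hypothetical stand-ins written for this illustration, not LLVM classes or functions. The sketch assumes the first half can still fall through into the second half, for example because it ends in a predicated return. In LLVM proper I would expect MachineBasicBlock::transferSuccessors and MachineBasicBlock::addSuccessor to play the corresponding roles, but that mapping is an assumption and is not exercised here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a machine basic block; NOT llvm::MachineBasicBlock.
struct ToyBlock {
  std::string name;
  std::vector<std::string> instrs; // opcodes only, for illustration
  std::vector<ToyBlock *> succs;   // outgoing CFG edges
};

// Move instrs[at..end) from `head` into the fresh block `tail` and repair
// the CFG edges. Assumes `head` can fall through into `tail` after the split.
void splitAt(ToyBlock &head, ToyBlock &tail, std::size_t at) {
  assert(at <= head.instrs.size() && "split point out of range");
  assert(tail.instrs.empty() && tail.succs.empty() && "tail must be fresh");

  auto mid = head.instrs.begin() + static_cast<std::ptrdiff_t>(at);
  tail.instrs.assign(mid, head.instrs.end());
  head.instrs.erase(mid, head.instrs.end());

  // The old outgoing edges belong to the block that now holds the final
  // branch; `tail.succs` was empty, so the swap leaves `head.succs` empty.
  tail.succs.swap(head.succs);
  // The only edge left out of `head` is the fall-through into `tail`.
  head.succs.push_back(&tail);
}
```

Nothing in this sketch happens automatically: if the two edge updates are skipped, the successor list of the first half still names the old targets, which is exactly the stale-CFG situation described above.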

> I don't think this actually has any impact in practice; I mean, I guess 
> it's an issue in theory, but in practice we don't stick branches into 
> the middle of basic blocks.

I did not expect a branch in the middle of a basic block either, until yesterday, when LLVM release 4.0.0 produced the following machine basic block just before the ARMConstantIslands pass runs:

  bb.1.if.end:
    successors: %bb.3.for.body(0x80000000)
    liveins: %r4
  
    %r0 = tMOVr %r4, 14, _, debug-location !23
    tBL 14, _, $__aeabi_i2d, csr_aapcs, implicit-def dead %lr, implicit %sp, implicit %r0, implicit-def %sp, implicit-def %r0, implicit-def %r1, debug-location !23
    tBL 14, _, @sqrt, csr_aapcs, implicit-def dead %lr, implicit %sp, implicit %r0, implicit %r1, implicit-def %sp, implicit-def %r0, implicit-def %r1, debug-location !24
    tBL 14, _, $__aeabi_d2iz, csr_aapcs, implicit-def dead %lr, implicit %sp, implicit %r0, implicit %r1, implicit-def %sp, implicit-def %r0, debug-location !25
    DBG_VALUE 2, 0, !17, !18, debug-location !27
    DBG_VALUE debug-use %r0, debug-use _, !16, !18, debug-location !26
    tCMPi8 %r0, 2, 14, _, implicit-def %cpsr, debug-location !32
    t2IT 11, 28, implicit-def %itstate
    %r0 = tMOVi8 _, 1, 11, %cpsr, implicit %r0, implicit %itstate
    tPOP_RET 11, %cpsr, def %r4, def %r6, def %r7, def %pc, implicit %r0, implicit %r4, implicit killed %itstate, debug-location !44
    %r1 = t2MOVi 2, 14, _, _
    t2B %bb.3.for.body, 14, _

Note that the terminator tPOP_RET appears before the non-terminator t2MOVi.
The command line to produce this is as follows:

llc -mtriple=thumbv7m-none-none-eabi -mcpu=cortex-m3 -O1 -stop-before=arm-cp-islands -o prime-factorize.mir prime-factorize.ll

Attached are the input file prime-factorize.ll and output file prime-factorize.mir.

The machine basic block above violates my previous assumption that only terminators and debug instructions may appear after the first terminator in a machine basic block.
I don't know how "bad" this could be, i.e., how many non-terminators in a program could be generated between two terminators.
So I have to consider splitting such basic blocks before I can instrument the program with control flow checks.
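
As a rough illustration of the assumption stated above, the following sketch scans a block for the first instruction that breaks it, which is also where a split would have to happen. ToyInstr and firstViolation are hypothetical names invented for this example; the two boolean flags merely stand in for what LLVM would report through MachineInstr::isTerminator() and its debug-value query, and the classification of each opcode in a real block would have to come from LLVM itself.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a machine instruction; NOT llvm::MachineInstr.
struct ToyInstr {
  std::string opcode;
  bool isTerminator; // assumed to mirror the real terminator flag
  bool isDebug;      // assumed to mark DBG_VALUE-like instructions
};

// Returns the index of the first instruction that is neither a terminator
// nor a debug instruction but follows a terminator. Returns instrs.size()
// when the block satisfies the assumption.
std::size_t firstViolation(const std::vector<ToyInstr> &instrs) {
  bool seenTerminator = false;
  for (std::size_t i = 0; i < instrs.size(); ++i) {
    if (instrs[i].isTerminator) {
      seenTerminator = true;
    } else if (seenTerminator && !instrs[i].isDebug) {
      return i; // a split would be needed in front of this instruction
    }
  }
  return instrs.size();
}
```

Applied to the tail of the block shown earlier (tPOP_RET, t2MOVi, t2B), the scan would stop at t2MOVi, since it is a non-terminator that follows the terminator tPOP_RET.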

I don't know whether this is a bug or not. If it is not a bug, this apparently isn't a purely theoretical issue.

Best regards!

Ming Zhang

