[llvm-dev] Basic block merging

Shawn Landden via llvm-dev llvm-dev at lists.llvm.org
Wed May 29 11:31:16 PDT 2019


On Wed, May 29, 2019 at 10:49 AM David Jones via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
>
> Under certain circumstances, my compiler outputs basic blocks having the same function:
>
> bb_97:                                            ; preds = %bb_1
>   %476 = getelementptr inbounds %LMtop.I0.ARType, %LMtop.I0.ARType* %0, i64 0, i32 6
>   %477 = bitcast i8** %476 to %LBstd.Cprocess.CRType**
>   %478 = load %LBstd.Cprocess.CRType*, %LBstd.Cprocess.CRType** %477, align 8
>   %479 = getelementptr inbounds %LBstd.Cprocess.CRType, %LBstd.Cprocess.CRType* %478, i64 0, i32 3
>   %480 = bitcast i8** %479 to %LMtop.I0.ARType**
>   store %LMtop.I0.ARType* %0, %LMtop.I0.ARType** %480, align 8
>   br label %bb_106
>
> bb_98:                                            ; preds = %bb_1
>   %481 = getelementptr inbounds %LMtop.I0.ARType, %LMtop.I0.ARType* %0, i64 0, i32 6
>   %482 = bitcast i8** %481 to %LBstd.Cprocess.CRType**
>   %483 = load %LBstd.Cprocess.CRType*, %LBstd.Cprocess.CRType** %482, align 8
>   %484 = getelementptr inbounds %LBstd.Cprocess.CRType, %LBstd.Cprocess.CRType* %483, i64 0, i32 3
>   %485 = bitcast i8** %484 to %LMtop.I0.ARType**
>   store %LMtop.I0.ARType* %0, %LMtop.I0.ARType** %485, align 8
>   br label %bb_106
>
> These blocks have provably the same function - the instructions of one correspond perfectly to instructions in the other.
>
> I would have expected some optimization to detect this, and merge the blocks by forwarding all jumps to one, to jump to the other.
>
> I notice a "merge functions" pass, which does this if two functions can be proven to have identical effect.
>
> Is there a pass that merges blocks? Is it enabled by default?  If so, then is there something non-obvious in the code that would be defeating this optimization?
I see two problems here (neither unsolvable): 1) I believe the
register allocator is working over a full function, and for this merge
the registers and stack/frame offsets have to be the same. 2) if the
block very small the icache might not work as well, and on some arches
long jumps are more expensive than short jumps and require trunks.

Basically, if this was to be done, the blocks that are merged would
have to be turned into functions (with the fastcc calling convention).

-Shawn Landden


More information about the llvm-dev mailing list