[LLVMdev] Interprocedural Block Placement algorithm, challenges and opportunities

Wed Mar 19 10:34:38 PDT 2014

Hi,

I have written a feedback-directed code layout optimization pass, which
currently supports basic block reordering and function reordering. It is
quite effective (we could speed up Python by 30%). The profiling method is
window-based and context-sensitive, and builds on reference affinity (
https://urresearch.rochester.edu/fileDownloadForInstitutionalItem.action?itemId=28368&itemFileId=143426
)
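
To give a rough idea of what "window-based" means here, below is a minimal
sketch, not the actual pass. It assumes the profile is a plain trace of
basic block ids, counts how often two distinct blocks appear within the same
sliding window, and then builds a layout greedily from those counts. The
function names, the window size, and the greedy chaining are all
illustrative assumptions; the affinity definition in the linked paper is
more involved than this co-occurrence count.

```python
from collections import Counter


def cooccurrence_counts(trace, window):
    """Count how often two distinct blocks fall inside one sliding window.

    Hypothetical simplification: for every trace position, each distinct
    block seen in the preceding (window - 1) entries gets one count
    together with the current block.
    """
    counts = Counter()
    for i, cur in enumerate(trace):
        recent = set(trace[max(0, i - window + 1):i])
        recent.discard(cur)
        for other in recent:
            counts[frozenset((cur, other))] += 1
    return counts


def greedy_layout(trace, window):
    """Order blocks so that each one follows its closest companion.

    Starts from the first block executed and repeatedly appends the
    unplaced block with the highest count against the last placed block.
    Ties are broken by first appearance in the trace.
    """
    counts = cooccurrence_counts(trace, window)
    first_seen = list(dict.fromkeys(trace))
    if not first_seen:
        return []
    order = [first_seen[0]]
    remaining = first_seen[1:]
    while remaining:
        last = order[-1]
        best = max(remaining, key=lambda b: counts[frozenset((last, b))])
        order.append(best)
        remaining.remove(best)
    return order
```

For the trace A B C A C A C B with a window of 2, blocks A and C co-occur
most often, so the sketch places C right after A even though B was executed
before C.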

The pass works at the IR level. Therefore, it may lose some information
during the machine code optimization passes and be imprecise for BB
reordering.

Eventually, I would like to see the improvement from an interprocedural
basic block reordering pass. However, with the current system there are
several challenges ahead. The most important is that the CFG is not
preserved across several passes, including code-gen-prepare, cfg-simplify,
remove-unreachable-blocks, tail-merge, and tail-duplication. So in order to
keep track of the mapping between MBBs and BBs, one needs to insert code in
every function that modifies the structure of BBs and MBBs.
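
The bookkeeping this would require can be sketched as a side table that
every CFG-modifying transformation has to update. This is only a model of
the idea under stated assumptions: the class and method names below are
hypothetical and are not LLVM APIs, and block ids are plain strings.

```python
class BlockOriginMap:
    """Hypothetical side table: machine block id -> set of IR block ids."""

    def __init__(self):
        self._origins = {}

    def create(self, mbb, ir_blocks):
        # Called when a machine block is first lowered from IR blocks.
        self._origins[mbb] = set(ir_blocks)

    def on_merge(self, kept, removed):
        # E.g. tail merging: the surviving block inherits both origins.
        self._origins[kept] |= self._origins.pop(removed)

    def on_duplicate(self, src, copy):
        # E.g. tail duplication: the copy shares the source's origins.
        self._origins[copy] = set(self._origins[src])

    def on_delete(self, mbb):
        # E.g. unreachable block removal: drop the entry if present.
        self._origins.pop(mbb, None)

    def origins(self, mbb):
        # Unknown blocks map to an empty set rather than raising.
        return frozenset(self._origins.get(mbb, ()))
```

Each pass listed above would need to call the matching hook whenever it
merges, duplicates, or deletes a block, which is exactly the intrusive part.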

The current block placement pass (MachineBasicBlockPlacement) works at the
machine code level and, with the new profiling structure
(SampleProfileLoader), is effective as long as context-free profiling info
is considered sufficient. However, the implementation of
SampleProfileLoader itself encourages context-sensitive info, which cannot
be provided efficiently with the current profiling structure
(<func,lineNo>).

Is there any way to incorporate information into the emitted MBBs so that
we can get IR basic block level info instead of lineNo info?

regards