[LLVMdev] global control flow graph at machine code level

Mon Jul 30 02:06:24 PDT 2012

Hi Abhishek,
On Sunday, July 29, 2012 18:32:11 AbhishekR wrote:
> It seems like I may have to modify the way MachineFunction is instantiated in MachineFunctionAnalysis. Instead of doing it per Function, it may have to be done for the entire Module by instantiating MachineFunction objects for every Function inside the Module. This might require major changes to the PassManager framework as well. Is there some work in this direction and code that someone can share? Or an alternative solution?

Yes, the MachineFunctionAnalysis creates the MachineFunctions. Unfortunately,
a MachineFunction is destroyed along with the MachineFunctionAnalysis that
created it. This happens, for instance, when you schedule a module pass (where
you could operate on the global control flow) somewhere during code generation.

A workaround could be to modify the MachineFunctionAnalysis such that it
stores the MachineFunction in a look-up table inside the MachineModuleInfo
instead of destroying it before your module pass is run. Once your module
pass has finished, a new MachineFunctionAnalysis is scheduled by the pass
manager. Now, instead of creating a new function, it could check the look-up
table of the MachineModuleInfo to get the original MachineFunction back.
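
To make the idea a bit more concrete, here is a minimal, untested-in-LLVM
sketch of the look-up table. It does not use the real LLVM headers: Function,
MachineFunction, MachineModuleInfo and MachineFunctionAnalysis below are
simplified stand-ins that only borrow the names, and stash(), take() and
size() are hypothetical helpers I made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Simplified stand-ins, NOT the real LLVM classes.
struct Function { std::string Name; };

struct MachineFunction {
  const Function *Fn;
  int NumInstrs = 0; // placeholder for the generated machine code
  explicit MachineFunction(const Function *F) : Fn(F) {}
};

// Stand-in for MachineModuleInfo, extended with the look-up table.
class MachineModuleInfo {
  std::map<const Function *, std::unique_ptr<MachineFunction>> Saved;

public:
  // Hypothetical helper: keep a MachineFunction alive across a module pass.
  void stash(std::unique_ptr<MachineFunction> MF) {
    assert(MF && "cannot stash a null MachineFunction");
    const Function *Key = MF->Fn;
    Saved[Key] = std::move(MF);
  }

  // Hypothetical helper: hand the saved MachineFunction back, if any.
  std::unique_ptr<MachineFunction> take(const Function *F) {
    auto It = Saved.find(F);
    if (It == Saved.end())
      return nullptr;
    std::unique_ptr<MachineFunction> MF = std::move(It->second);
    Saved.erase(It);
    return MF;
  }

  std::size_t size() const { return Saved.size(); }
};

// Stand-in for MachineFunctionAnalysis using the table.
class MachineFunctionAnalysis {
  MachineModuleInfo &MMI;
  std::unique_ptr<MachineFunction> MF;

public:
  explicit MachineFunctionAnalysis(MachineModuleInfo &M) : MMI(M) {}

  // Reuse the saved MachineFunction if there is one, else create a new one.
  MachineFunction &runOnFunction(const Function &F) {
    MF = MMI.take(&F);
    if (!MF)
      MF.reset(new MachineFunction(&F));
    return *MF;
  }

  // Instead of destroying the MachineFunction, park it in the table.
  void releaseMemory() {
    if (MF)
      MMI.stash(std::move(MF));
  }
};
```

With this, a second MachineFunctionAnalysis scheduled after the module pass
gets the very same MachineFunction object back for a given Function. The
sketch never frees entries that are not taken again, so a real implementation
would also need to decide when the table is finally cleared.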

It does not appear to be a very complicated change, but there might be some
messy details that make this approach hard to implement, e.g., information
stored in the MachineFunctionAnalysis itself. You could move this information
to a new class and treat the MachineFunctionAnalysis as a wrapper around it.

A nice property of this solution is that code generation still proceeds on a
per-function basis, unless you explicitly insert a module pass.

Another problem is that there is little (or no) support for constructing the
global control flow from the machine code. For instance, the call graph is
based on LLVM-IR. Depending on the target architecture, you might have call
sites in the machine code that were not visible in the LLVM-IR, e.g., when you
implement floating point operations using library calls. So there might be
some extra work needed to get this infrastructure, too.
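
As a rough illustration of what that extra work could look like, the sketch
below collects call edges by walking the machine instructions instead of the
LLVM-IR. Again, this is not real LLVM code: MachineInstr, MachineBasicBlock
and MachineFunction are toy stand-ins, the IsCall and Callee fields are
assumptions, and buildMachineCallGraph() is a hypothetical name. Indirect
calls (no known callee) are simply skipped here.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy stand-ins, NOT the real LLVM classes.
struct MachineInstr {
  bool IsCall;        // assumed flag: is this instruction a call?
  std::string Callee; // assumed: symbol name, empty if unknown/indirect
};

struct MachineBasicBlock {
  std::vector<MachineInstr> Instrs;
};

struct MachineFunction {
  std::string Name;
  std::vector<MachineBasicBlock> Blocks;
};

// Caller name -> set of callee names found in the machine code.
using MachineCallGraph = std::map<std::string, std::set<std::string>>;

MachineCallGraph
buildMachineCallGraph(const std::vector<MachineFunction> &MFs) {
  MachineCallGraph CG;
  for (const MachineFunction &MF : MFs) {
    // Create a node even for functions without any call sites.
    std::set<std::string> &Callees = CG[MF.Name];
    for (const MachineBasicBlock &MBB : MF.Blocks)
      for (const MachineInstr &MI : MBB.Instrs)
        if (MI.IsCall && !MI.Callee.empty())
          Callees.insert(MI.Callee);
  }
  return CG;
}
```

A call to a soft-float routine such as __addsf3, which only shows up after
instruction selection, would appear as an edge in this graph even though the
IR-level call graph knows nothing about it.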

I have not worked on this yet, but I plan to implement something like this
at some point. If you find another (better) solution, let me know.

Best,
Florian

-- 
Florian Brandner
Embedded Systems Engineering Group
Department of Informatics and Mathematical Modeling
DTU Informatics
Technical University of Denmark
Richard Petersens Plads
Building 322, room 206
2800 Lyngby
Denmark

phone: +45 45255223
web: http://www.imm.dtu.dk/~flbr/
email: flbr at imm.dtu.dk