[LLVMdev] Reversing a function's CFG?

Wed Mar 23 11:58:22 PDT 2011

Forgot to CC the list.

On Wed, Mar 23, 2011 at 12:51 PM, Joshua Warner <joshuawarner32 at gmail.com>wrote:

>
>> In large-scale parallel simulations, sometimes you process an event
>> speculatively.  If it turns out you should not have processed that event
>> (and such situations can be detected by our system), you need to undo all of
>> the changes you made in your event handler.  Most approaches use some form
>> of state-saving.  We opt for an approach coined by my advisor (Chris
>> Carothers, actually) called "reverse computation."  The reverse function is
>> basically the inverse of the forward event handler e.g. if in the forward
>> event handler you increment a variable, the reverse handler would need to
>> decrement it.  That's an extremely simple case.  A more complicated case: if
>> you have an "if" in your forward handler, you must augment it with a
>> bitfield to remember which path you took so the reverse handler can find its
>> way back.  So we're basically saving our control state as opposed to our
>> data state as the control state is often significantly smaller.  I like to
>> think of it as a trail of breadcrumbs.  Using different approaches we can
>> handle loops, ifs, switches, etc. by following our "breadcrumbs" back.
>>
>> Reversing the CFG is just a piece of the puzzle.  To generate our reverse
>> handler, I thought a good place to start would be the reverse CFG.
>>  Specifically, I need to:
>>
>> 1. instrument the forward handler control path
>> 2. emit the reverse handler given the forward handler (which I believe can
>> be achieved by the following)
>> 2a. create the reverse CFG
>> 2b. for each basic block in (2a), invert the instructions
>>
>> Unfortunately, not all functions are invertible as you said.  In those
>> cases, we may opt to fall back on state-saving or use some heuristics to try
>> and bypass state-saving (memory operations can be expensive on some of the
>> machines we run simulations on).
>>
>> Assuming the above steps go off without a hitch (which is a big
>> assumption), we should have our reverse event handler generated for us.
>>
>> I do have some questions: is there any way to differentiate IR I inserted
>> while instrumenting the forward path from IR that was there before?  I don't
>> want to invert my instrumentation instructions.  Also, due to this problem,
>> I've been attempting to write a Module pass as opposed to a Function pass
>> because I couldn't differentiate between the two.
>>
>> Carothers did it all by hand! :)  In the paper he talks about automating
>> it.  As I have taken a few compiler courses, apparently I'm the compiler guy
>> in our HPC group.
>>
>>
> Thanks - that makes much more sense.   Sounds intriguing!  I would be very
> interested in seeing a pass like this get into the LLVM trunk - it would go
> a long way to providing reverse debugging (wikipedia it) for all languages
> that target LLVM.  There are very few (if any) good, free reverse debuggers
> available for higher-level languages like Java and Python, let alone one for
> C and C++.  That would be amazing!
>
> I don't think there could ever be the sort of pre-built procedure for
> reversing the control flow that you want - it would be heavily dependent on
> the format of the data you produce in the forward computation.  The best way
> to do this is probably just to translate a function block-by-block.  Compute
> the predecessors of the original block, then add a switch instruction to the
> end of the new block that computes how control flow reached that block when
> it was run in the forward.
>
> Inverting each instruction should be just as easy - if its invertible
> (including a call to an instrumented function), just invert, otherwise read
> the recorded state and fix it up.
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110323/88f4bdbb/attachment.html>