[LLVMdev] Reversing a function's CFG?

Thu Mar 24 04:21:41 PDT 2011

This is in reply to the posting below.

I am not a compiler writer type, so I am probably in over my head a bit.

Several years ago I was a bit interested in something called code slicers. 
An example of one (probably no longer supported) is UNRAVEL
(http://www.itl.nist.gov/div897/sqg/unravel/unravel.html )

Their basic idea is to identify the algorithm which serves to determine the
value of variable at a specific location in the code.
That is they seek to determine all the paths which influence a computation
result. They do not make any assumptions about the possible paths which lead
to the specific location. (ergo many such paths)
They accomplish this by walking the computation backwards and pruning out
stuff which is just not relevant (you unfortunately cannot prune out the
stuff which is not relevant)

You appear to have the very same problem (except the Boolean evaluation
which serves to indicate a need for backtracking), with a minor twist. You
have instrumentation which allows you to determine the path.

The strategy which they use (and the problems which they experience) are
likely to be the same problems as you will experience

This is what I see
(1) There will be functions which cannot be inverted. Identifying them is
the principle task. They are the locations where in addition to control
information, data checks are necessary
(2) In the instrumented code you know the locations where the results have
to be reversed. You can infer the location in the actual code. Construct the
paths based not on the instrumented code, but the actual code 
(3) There are probably only a finite number of paths, each identified by a
different instrumentation configuration (ie a state variable). You reverse
the erroneous result by running a pre-compiled procedure generated from the
reverse path. 
(4) Associate the reversing algorithm with the state and execute when
appropriate

Dave Lightstone

////////////////////////////////////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////////
/

Date: Wed, 23 Mar 2011 13:52:40 -0400
From: "Justin M. LaPre" <justin at lapre.com>
Subject: Re: [LLVMdev] Reversing a function's CFG?
To: joshuawarner32 at gmail.com
Cc: llvmdev at cs.uiuc.edu
Message-ID: <5B806BA1-94DB-4EF0-B526-511F6EDFB28B at gmail.com>
Content-Type: text/plain; charset="us-ascii"

On Mar 23, 2011, at 12:24 PM, Joshua Warner wrote:

> Hi Justin,
> 
> I take the fact that nobody has replied as a sign that nobody really
understands what you are asking.

Most likely.  I was trying not to write a ten page e-mail to the list but
quite possibly simplified too much.  I'll try to elaborate a little more:

In large-scale parallel simulations, sometimes you process an event
speculatively.  If it turns out you should not have processed that event
(and such situations can be detected by our system), you need to undo all of
the changes you made in your event handler.  Most approaches use some form
of state-saving.  We opt for an approach coined by my advisor (Chris
Carothers, actually) called "reverse computation."  The reverse function is
basically the inverse of the forward event handler e.g. if in the forward
event handler you increment a variable, the reverse handler would need to
decrement it.  That's an extremely simple case.  A more complicated case: if
you have an "if" in your forward handler, you must augment it with a
bitfield to remember which path you took so the reverse handler can find its
way back.  So we're basically saving our control state as opposed to our
data state as the control state is often significantly smaller.  I like to
think of it as a trail of breadc!
 rumbs.  Using different approaches we can handle loops, ifs, switches, etc.
by following our "breadcrumbs" back.

> I was wondering if there was a quick way to reverse a function's CFG 
> and, in turn, all basic blocks within it.  Assuming all variables are 
> globals, is there a quick way to generate a function's reversal?  I 
> highly doubt such functionality exists but I figured it was worth 
> asking.  I'm trying to develop an "undo function" generator pass that 
> would be able to restore system state (without state-saving) after 
> determining an error occurred.  This is documented in, "Efficient 
> Optimistic Parallel Simulations using Reverse Computation" by 
> Carothers et al.[1]
>  
> 
> I'm unaware of an easy way to "reverse the CFG" - but even if there was, I
don't think that would solve your problem.  First, you will only be able to
generate "undo" functions for a small subset of all possible functions -
specifically, invertible functions.  Second, my intuition is that computing
an inverse function will be significantly more involved than just reversing
the CFG.

Reversing the CFG is just a piece of the puzzle.  To generate our reverse
handler, I thought a good place to start would be the reverse CFG.
Specifically, I need to:

1. instrument the forward handler control path 2. emit the reverse handler
given the forward handler (which I believe can be achieved by the following)
2a. create the reverse CFG 2b. for each basic block in (2a), invert the
instructions

Unfortunately, not all functions are invertible as you said.  In those
cases, we may opt to fall back on state-saving or use some heuristics to try
and bypass state-saving (memory operations can be expensive on some of the
machines we run simulations on). 

Assuming the above steps go off without a hitch (which is a big assumption),
we should have our reverse event handler generated for us.

I do have some questions: is there any way to differentiate IR I inserted
while instrumenting the forward path from IR that was there before?  I don't
want to invert my instrumentation instructions.  Also, due to this problem,
I've been attempting to write a Module pass as opposed to a Function pass
because I couldn't differentiate between the two.

> Could you be a little more specific about the problem you are dealing
with?  How do Carothers et al. do the inversion?

Carothers did it all by hand! :)  In the paper he talks about automating it.
As I have taken a few compiler courses, apparently I'm the compiler guy in
our HPC group.

Anyway, thanks for the response.  If I was unclear in any way, please ask
questions!

-Justin