[LLVMdev] Optimization passes and debug info

Thu Jul 24 14:41:26 PDT 2008

Hi Chris,

> > 	1) -g should not affect the outputted code
> > 	2) Transformations should preserve as much debug info as possible
> 
> I don't see how choosing between the two goals is necessary, you can  
> have both.  Take a concrete example, turning:
> 
>   if (c) {
>      x = a;
>    } else {
>      x = b;
>    }
> 
> into:
> 
>    x = c ? a : b
> 
> This is a case where our debug info won't be able to represent the  
> xform correctly
So that directly means choosing between two goals: You can either do the
transformation, but change debug info, or you can keep debug info (and thus
single stepping capabilities, for example) intact but that changes the
resulting code output.
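
To make the two forms concrete, here is a minimal C sketch of the same assignment written both ways. The function names are made up for illustration. In the first form a debugger can stop on the "x = a" and "x = b" lines separately; in the second there is only one line left to stop on, even though the result is identical.

```c
/* Branch form: each assignment sits on its own source line, so each
 * can carry its own stoppoint. */
int pick_with_branch(int c, int a, int b)
{
    int x;
    if (c) {
        x = a;
    } else {
        x = b;
    }
    return x;
}

/* Select form: the same value, but a single source line and a single
 * place to stop. */
int pick_with_select(int c, int a, int b)
{
    int x = c ? a : b;
    return x;
}
```

Both functions return the same value for every input; only the stepping behaviour a debugger can offer differs.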

> Why?  Is this an "optimize as hard as you can without breaking debug
> info" flag?  Who would use it (what use case)?
The use case I see is when a bug is introduced or triggered by a
transformation. I.e., I observe a bug in my program. I compile with -g -O0
(since I want full stepping capabilities), but now the bug is gone.

So, I compile with -g -O2 and the bug is back, but my debugging info is
severely crippled, making debugging a lot less fun.

In this case, having the option of making the compiler try a bit harder to
preserve debugging info is useful to ease debugging. As pointed out by Luke,
there are areas in which this is particularly important (he names parallel
programming and synchronization). I do think that the weak point of this
argument is that the best it gets you is that debugging might get easier,
if you're lucky, but it might also make the bug vanish again.

To make this more specific, however, say that I have two nested loops:
	do {
		do {
			foo();
		} while (a());
		bar();
	} while (b());

When compiled, the loop header of the inner loop is in a lot of cases an empty
block, containing only phi nodes and an unconditional branch instruction. (Not
sure if the above example does this, I don't have clang or llvm-gcc at hand atm).

There is code in simplifycfg to remove such a block, which is possible in a
lot of cases. However, when debugging info is enabled, a stoppoint will be
generated inside such a block. This stoppoint represents the start of the
inner loop (i.e., just before the inner loop is executed for the first time,
not the beginning of every iteration).

By default (and in your approach, always) the basic block is removed and the
stoppoint thrown away. This means that a fairly useful stoppoint is removed,
even at -O1 (since simplifycfg will run then).
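
As a rough illustration of the decision described above, here is a toy model in C. It is not the actual simplifycfg code: the struct, its fields and the preserve_stoppoints flag are all hypothetical and only mirror the description in this mail (a block holding nothing but phi nodes, possibly a stoppoint, and an unconditional branch).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of a basic block; not an LLVM data structure. */
struct block_summary {
    int phi_nodes;                     /* phi nodes at the top of the block */
    int stoppoints;                    /* debug stoppoints in the block */
    int other_instructions;            /* everything else except the terminator */
    bool ends_in_unconditional_branch; /* terminator is a plain branch */
};

/* Decide whether the block may be removed.
 * With preserve_stoppoints == false, stoppoints are ignored and thrown
 * away together with the block. With preserve_stoppoints == true, a
 * block that still carries a stoppoint is kept. */
bool may_remove_block(const struct block_summary *b, bool preserve_stoppoints)
{
    assert(b != 0);
    if (!b->ends_in_unconditional_branch || b->other_instructions > 0)
        return false;
    if (preserve_stoppoints && b->stoppoints > 0)
        return false;
    return true;
}
```

For the inner loop header described above (some phi nodes, one stoppoint, an unconditional branch), this model removes the block when stoppoints are ignored and keeps it when they are to be preserved.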

I can see that in most cases, debugging at -O0 is probably sufficient.
However, I can't help thinking that even in a debugging build, perfect
debugging info should be combinable with (partial) optimization.

I'm no longer sure that it is as important as I initially thought, though it
still feels like a shame if we would have no way whatsoever to be a bit more
conservative about throwing away debug info.

> There is no balance here, the two options are:
> 
> 1) debug info never changes generated code.
> 2) optimization never breaks debug info.
> 
> The two are contradictory (unless all optimizations can perfectly  
> update debug info, which they can't), so it is hard to balance  
> them :).  
Especially because these two options are so contradictory, I can see a third
option in the middle. The above two options, corresponding to the outer
levels, are easy. If you can't update debug information through a
transformation, you either ignore that (option 1) or leave the code unchanged
(option 2). An extra middle level would try to find a balance. If the loss of
debug info is "small", you go ahead with the transformation, but if you lose
"a lot" of debug information, leave the code. The tricky part here is to
define where the border between "small" and "a lot" is, but that could be left
a little vague.
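
A minimal sketch of how a pass could consult such a level, assuming the loss of debug info can be expressed as a count of stoppoints that cannot be preserved. The enum names, the function and the threshold parameter are all hypothetical; nothing like this exists in LLVM today, and where exactly to draw the line is the vague part mentioned above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical debug preservation levels. */
enum debug_level {
    DEBUG_IGNORE,   /* option 1: debug info never changes generated code */
    DEBUG_BALANCE,  /* middle: accept a "small" loss, refuse "a lot" */
    DEBUG_PRESERVE  /* option 2: optimization never breaks debug info */
};

/* Decide whether a transformation may go ahead.
 * lost_stoppoints: stoppoints the transformation cannot keep up to date.
 * small_loss_limit: largest loss still considered "small" (assumed knob). */
bool should_transform(enum debug_level level, int lost_stoppoints,
                      int small_loss_limit)
{
    assert(lost_stoppoints >= 0);
    assert(small_loss_limit >= 0);
    switch (level) {
    case DEBUG_IGNORE:
        return true;
    case DEBUG_BALANCE:
        return lost_stoppoints <= small_loss_limit;
    case DEBUG_PRESERVE:
        return lost_stoppoints == 0;
    }
    return true; /* unknown level: fall back to option 1 */
}
```

With a limit of 1, for example, a transformation that loses a single stoppoint goes ahead at the middle level but is refused at the strictest level.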

> My perspective follows from use cases I imagine for C family  
> of languages: I'll admit that other languages may certainly want #2.   
> Can you talk about why you want this?
As stated above, I don't have a particularly solid reason, other than a decent
hunch of usefulness.

Since Devang originally proposed the three-level scheme (I originally thought
of having two levels only), perhaps he has some particular motivation to add
to this discussion? :-)

How would it be to add the proposed debugging levels, update some of the
passes and see how it turns out? I'm not sure I can invest enough time to
fully see this one through, though, since I'm going from fulltime to one day
per week after next week...

If we would add such a level, would you agree that the PassManager is a good
place to store it?

Gr.

Matthijs