[LLVMdev] Marking source locations without interfering with optimization?

Fri Aug 19 18:56:55 PDT 2005

I've been thinking of adding an instruction, and I'm following the
advice in the docs to consult the list before doing something rash.

What I want to do is provide a way to identify variable names and
source locations that doesn't affect the effectiveness of
optimizations. This is not the same problem as supporting debug info,
because I don't care about looking up unique names for memory
locations, evaluating expressions, etc. I just want to be able to say,
during an optimization pass, what the best guesses for the source
location and variable names are for a value or instruction that the
pass is doing something interesting to.

Because I don't need to support the functionality of a debugger with
this, it is OK if that best guess contains more than one possibility,
as long as it isn't a huge number of possibilities. The idea is that
I'm producing information for a programmer who needs to know what is
going on during optimization, so I want to give them as much detail as
possible. It's OK if it isn't exact, but it is not OK if it interferes
with the optimization, because avoiding that is the whole point.
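
To make the "more than one possibility" idea concrete, here is a
minimal sketch of a side table that keeps a small, capped set of
candidate source locations per value and merges them when a pass
replaces one value with another. Everything in it (SourceLoc, ValueId,
SourceGuessTable, the cap) is a hypothetical name chosen for
illustration; none of it is existing LLVM API.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <tuple>

// Hypothetical names throughout; none of this is LLVM API.
struct SourceLoc {
    std::string file;
    unsigned line;
    std::string variable;  // best-guess variable name, may be empty
    bool operator<(const SourceLoc &o) const {
        return std::tie(file, line, variable) <
               std::tie(o.file, o.line, o.variable);
    }
};

using ValueId = unsigned;  // stand-in for a pointer to an IR value

class SourceGuessTable {
public:
    explicit SourceGuessTable(std::size_t maxGuesses)
        : maxGuesses_(maxGuesses) {}

    // Record one candidate origin for a value. Once the cap is reached,
    // further candidates are dropped so the set never grows huge.
    void addGuess(ValueId v, const SourceLoc &loc) {
        std::set<SourceLoc> &s = guesses_[v];
        if (s.size() < maxGuesses_) s.insert(loc);
    }

    // Called when a pass replaces `from` with `to`: `to` keeps its own
    // guesses and also inherits those of `from`, subject to the cap.
    void replaced(ValueId from, ValueId to) {
        auto it = guesses_.find(from);
        if (it == guesses_.end()) return;
        std::set<SourceLoc> moved = it->second;  // copy before erasing
        guesses_.erase(it);
        for (const SourceLoc &loc : moved) addGuess(to, loc);
    }

    // All current candidates for a value (empty if nothing is known).
    std::set<SourceLoc> guessesFor(ValueId v) const {
        auto it = guesses_.find(v);
        return it == guesses_.end() ? std::set<SourceLoc>{} : it->second;
    }

private:
    std::size_t maxGuesses_;
    std::map<ValueId, std::set<SourceLoc>> guesses_;
};
```

Because the table lives outside the IR, it cannot change what a pass
sees. The cost is that every transformation has to remember to call
something like replaced(), which is the main weakness of this sketch.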

So, given those goals, it seems that just using the traditional debug
info as it is designed is not a good idea, since I want more and
fuzzier answers.

Also, unless I'm missing something, the debug info uses intrinsic
function calls, which are treated as un-analyzable. If I tried passing
actual values to those calls to link the values to the source, some
important analyses would fail. Is that right, or am I misunderstanding
the docs on intrinsics?

So, I thought one way to go would be to introduce an instruction meant
just for marking the source location of a value. It would consume a
value and some constants marking the location, and the front end could
generate it (not by default!) where necessary to make sure a value
could be traced back to its source location. It would either be
lowered away before codegen, or codegen would have to ignore it, since
we might still want that info at that point, for instance to trace a
register spill back to the variable that spilled.
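
One way to picture the marker, and the main hazard it creates, is the
toy IR below. It is only a sketch under assumed names (Opcode, Inst,
makeMarker, countRealUses and markersFor are all hypothetical and are
not LLVM classes). A marker consumes a value, so it shows up as an
extra user of that value; any pass that asks "how many uses does this
have?" would need to skip markers to get the same answer it would get
without them.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy IR for illustration only; these are not LLVM classes.
enum class Opcode { Add, Mul, SourceMarker };

struct Inst {
    Opcode op;
    std::vector<const Inst *> operands;  // values this instruction consumes
    // Only meaningful for SourceMarker: constants naming the location.
    std::string file;
    unsigned line;
    std::string variable;
};

// Build an ordinary instruction with no location payload.
Inst makeOp(Opcode op, std::vector<const Inst *> operands) {
    return Inst{op, std::move(operands), "", 0, ""};
}

// Build a marker that ties `value` to a source location and variable.
Inst makeMarker(const Inst *value, std::string file, unsigned line,
                std::string variable) {
    return Inst{Opcode::SourceMarker, {value}, std::move(file), line,
                std::move(variable)};
}

// Count operand slots that refer to `v`, ignoring markers, so that a
// use-count check is unaffected by whether markers were emitted.
std::size_t countRealUses(const std::vector<const Inst *> &body,
                          const Inst *v) {
    std::size_t n = 0;
    for (const Inst *i : body) {
        if (i->op == Opcode::SourceMarker) continue;
        for (const Inst *operand : i->operands)
            if (operand == v) ++n;
    }
    return n;
}

// Collect the markers attached to `v`, i.e. its candidate locations.
std::vector<const Inst *> markersFor(const std::vector<const Inst *> &body,
                                     const Inst *v) {
    std::vector<const Inst *> out;
    for (const Inst *i : body)
        if (i->op == Opcode::SourceMarker && !i->operands.empty() &&
            i->operands[0] == v)
            out.push_back(i);
    return out;
}
```

The sketch suggests where the trouble would come from: every query
that walks users would need a marker-aware variant like
countRealUses(), and a pass that deletes or replaces a value would
also have to decide what happens to the markers that refer to it.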

What problems can you think of with that approach? Am I asking for
trouble with passes, or would a semantically meaningless 'marker'
instruction be OK?

If you have suggestions for a better way to do this, that'd be great.
I didn't find much prior work on this; most of what I saw was about
debug info, which, as I stated, is not quite what I need.

Thanks!

-- 
Michael McCracken
UCSD CSE PhD Candidate
research: http://www.cse.ucsd.edu/~mmccrack/