[LLVMdev] Marking source locations without interfering with optimization?

Tue Aug 23 17:01:28 PDT 2005

On Tue, 23 Aug 2005, Michael McCracken wrote:
>> Okay...  this is tricky.  Anything that will bind to variables will
>> prevent modification to the variable.
>
> I see - so if I wanted to use my earlier approach, I'd need to change every
> optimization and analysis to treat the 'marker' instructions specially as
> instructions that don't modify their argument, a big mess...

Exactly.

> So it sounds like the only way to really not interfere with
> optimizations is to avoid binding to the variables, which means that
> if instructions are moved or copied, the markers I add won't be moved
> or copied along with the instruction. I was hoping to find a scheme
> that'd stay (mostly) up-to-date through modifications with minimal
> extra changes.

I don't really think there is a good way to do that.

>> I would suggest something like
>> this (C syntax for the llvm code):
>>
>> int foo() {
>>    %A = alloca int
>>    llvm.myintrinsic("A", "whatever data you want")
>> }
>
> Just to clarify, you're suggesting that I use the LLVM value's name to
> link up with the source info instead of actually binding to it - so in
> a slightly more complicated example I might do this:
>
> C code:
>
> 1: a = foo();
> 2: b = bar();
> 3: a = a + b;
>
> llvm code:
>
> %a = call foo()
> llvm.myintrinsic("%a", "a", 1)
> %b = call bar()
> llvm.myintrinsic("%b", "b", 2)
> %tmp.1 = add %a, %b
> llvm.myintrinsic("%tmp.1", "a", 3)

Exactly.  The %'s are a figment of the asmprinter's imagination, so you 
wouldn't need to include them, but this is basically what I was getting 
at.
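
To make the bookkeeping concrete, here is a rough C++ sketch of how a
tool might collect such markers once it has read them out of the
function. Everything in it is hypothetical: the Marker struct, the
stripPercent and indexBySourceVar helpers, and the idea of grouping by
source variable are illustration only, and nothing here is LLVM API.
It only shows that the names are stored without the leading '%'.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical record for one marker call of the form
//   llvm.myintrinsic(valueName, sourceVar, line)
struct Marker {
    std::string valueName;  // name of the LLVM value, e.g. "tmp.1"
    std::string sourceVar;  // source-level variable, e.g. "a"
    int line;               // source line number
};

// Drop a leading '%' if present; the '%' is only printed by the
// assembly printer and is not part of the value's name.
std::string stripPercent(const std::string &name) {
    if (!name.empty() && name[0] == '%')
        return name.substr(1);
    return name;
}

// Group markers by source variable, normalizing value names on the way in.
std::map<std::string, std::vector<Marker>>
indexBySourceVar(const std::vector<Marker> &markers) {
    std::map<std::string, std::vector<Marker>> index;
    for (const Marker &m : markers)
        index[m.sourceVar].push_back(
            {stripPercent(m.valueName), m.sourceVar, m.line});
    return index;
}
```

With the three markers from the example above, the entry for "a" would
hold two records (values "a" and "tmp.1"), and the entry for "b" one.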

-Chris

>> Given the above, you can use the constant string "A", to look up things in
>> the symbol table of the function.  You will probably want to accept "A"
>> and anything that starts with "A.".
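
The matching rule described here (accept "A" and anything that starts
with "A.") can be sketched in C++ as below. This is a minimal sketch
under stated assumptions: the function's symbol table is stood in for
by a plain list of name strings, and matchesMarkedName and
findMarkedValues are hypothetical helper names, not part of LLVM.
Requiring the '.' separator is what keeps an unrelated name such as
"AB" from matching.

```cpp
#include <string>
#include <vector>

// True if `candidate` is exactly `base`, or is `base` followed by '.'
// and any (possibly empty) suffix.
bool matchesMarkedName(const std::string &candidate, const std::string &base) {
    if (candidate == base)
        return true;
    return candidate.size() > base.size() &&
           candidate.compare(0, base.size(), base) == 0 &&
           candidate[base.size()] == '.';
}

// Collect every name in a (stand-in) symbol table that matches `base`.
std::vector<std::string>
findMarkedValues(const std::vector<std::string> &symbolNames,
                 const std::string &base) {
    std::vector<std::string> hits;
    for (const std::string &name : symbolNames)
        if (matchesMarkedName(name, base))
            hits.push_back(name);
    return hits;
}
```

A lookup for "A" over the names "A", "A.1", "AB" and "B" would return
"A" and "A.1" only. As noted in the thread, this is prone to obvious
problems (a value can be renamed or deleted entirely), so callers
should be prepared for an empty result.
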
>>
>>> So, I thought one way to go would be to introduce an instruction meant
>>> just for marking the source location of a value - it'd consume a value
>>> and some constants marking the location - then the front end could
>>> generate it (not by default!) where necessary to make sure a value
>>> could be traced back to its source location. It'd either be lowered
>>> away or it'd have to be ignored during codegen since we might still
>>> want to know that info then, for instance, to track register spills
>>> back to which variable spilled.
>>
>> I think the above will work for you, you can make it ignored or deal with
>> it however you want using the intrinsic lowering code.  Check out how
>> other intrinsics are handled (e.g. llvm.isunordered, which is handled by
>> the code generators and llvm.dbg.* which are not) for ideas.
>>
>>> What problems can you think of with that approach? Am I asking for
>>> trouble with passes, or would a semantically meaningless 'marker'
>>> instruction be OK?
>>
>> I'd seriously suggest using an intrinsic instead of an instruction: they
>> are far far easier to add.  Aside from that, using the symbol table is
>> really the only thing that will work, and is prone to obvious problems,
>> but should work pretty well in practice.
>>
>>> If you have suggestions for a better way to do this, that'd be great.
>>> There isn't a lot of prior work I found on this, most of what I saw
>>> was about debug info, which as I stated, is not quite what I need.
>>
>> Hope this helps!
>
> It's certainly given me lots to think about.
>
> Thanks,
> -mike

-Chris

-- 
http://nondot.org/sabre/
http://llvm.org/