[LLVMdev] Extending AsmPrinterHandler

Fri May 15 11:52:21 PDT 2015

On 05/14/2015 03:31 PM, Reid Kleckner wrote:
> On Wed, May 13, 2015 at 3:27 PM, Russell Hadley <rhadley at microsoft.com> wrote:
>
>     (background) The CoreCLR expects a JIT to produce an MSIL bytecode
>     offset to code offset mapping, annotated with a few extra bits
>     denoting whether an entry is in the prolog/epilog, whether it is a
>     call, or, in some cases, whether operands remain on the MSIL
>     virtual stack.  Our initial prototype has the MSIL offset stashed
>     in the line number field.  We could stash the extra bits in the
>     column info, but that's starting to feel too much like a hack.
>     We're looking for a way to either 1) extend the debug metadata to
>     hold our info and get it dumped into the in-memory object (a new
>     section would be fine if it's not too complicated), or 2) find a
>     place to extract the data we need when we have both the encoded
>     offset and access to the instructions.  We're looking for some
>     advice. :)
>
>
> This sounds like it's not really debug info so much as a description 
> of the stack frame that is required for correctness, like CFI (call 
> frame info that describes prologues and epilogues) and EH action 
> tables. You probably want to subclass AsmPrinterHandler and hook that 
> into the pipeline along with EH and debug info generation. Today this 
> requires upstream modification, but the actual pass code can live 
> wherever you want. Take a look at how Win64Exception.cpp and others 
> are emitting things like the ip2state table for __CxxFrameHandler3.
>
> Long term, if you want to 100% guarantee that the MSIL offset is 
> preserved through LLVM optimizations, I think we need some other 
> solution. Philip Reames was describing a similar problem, and I was 
> thinking that we should have a way to tack semantically important data 
> onto a function call like this. The best solution I could come up with 
> using existing tools was to use an invoke that unwinds to an 
> artificial landing pad that ends in unreachable and contains the 
> preserved data in its clause operands. LLVM optimizers will only merge 
> such calls if the landingpad destinations are the same, and they can't 
> merge landingpads with different clauses.
>
> Alternatively, it occurs to me that call sites support attributes, 
> which are different from metadata in that they are semantically 
> important. Optimizations cannot remove them. Maybe what we need is 
> just an attribute on the call site?
>
> Hope that helps. :)

FYI, if these are semantically important (and not just debug info), using
metadata is a really bad idea, since optimizations are free to drop it.
We've got a similar problem with information required to support
deoptimization, and we have local changes which mostly solve it.
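
To make the data under discussion concrete, here is a rough C++ sketch of
the kind of record the quoted message describes: an MSIL offset paired with
a native code offset plus a few flag bits. Every name in it (MapFlags,
OffsetMapEntry, OffsetMap) is hypothetical and chosen for illustration only;
it is not CoreCLR's actual format, and it uses no LLVM API. It only shows why
this is a side table that has to survive to code emission, rather than
something that fits naturally into line/column debug fields.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical flag bits mirroring the "extra bits" in the quoted message.
enum MapFlags : uint8_t {
  MF_None = 0,
  MF_Prolog = 1 << 0,        // entry lies in the function prolog
  MF_Epilog = 1 << 1,        // entry lies in the function epilog
  MF_CallSite = 1 << 2,      // entry corresponds to a call instruction
  MF_StackNotEmpty = 1 << 3, // operands remain on the MSIL virtual stack
};

// One row of the assumed mapping table.
struct OffsetMapEntry {
  uint32_t NativeOffset; // offset into the emitted machine code
  uint32_t ILOffset;     // MSIL bytecode offset
  uint8_t Flags;         // bitwise OR of MapFlags values
};

// Collects entries in any order, then answers "which MSIL offset covers
// this native offset?" once the table has been sorted.
class OffsetMap {
  std::vector<OffsetMapEntry> Entries;
  bool Sorted = true; // an empty table is trivially sorted

public:
  void add(uint32_t Native, uint32_t IL, uint8_t Flags) {
    Entries.push_back({Native, IL, Flags});
    Sorted = false;
  }

  // Sort by native offset; stable so equal offsets keep insertion order.
  void finalize() {
    std::stable_sort(Entries.begin(), Entries.end(),
                     [](const OffsetMapEntry &A, const OffsetMapEntry &B) {
                       return A.NativeOffset < B.NativeOffset;
                     });
    Sorted = true;
  }

  // Returns the last entry whose NativeOffset is <= Native, or nullptr
  // if Native precedes every entry (or the table is empty).
  const OffsetMapEntry *lookup(uint32_t Native) const {
    assert(Sorted && "call finalize() before lookup()");
    auto It = std::upper_bound(
        Entries.begin(), Entries.end(), Native,
        [](uint32_t N, const OffsetMapEntry &E) { return N < E.NativeOffset; });
    if (It == Entries.begin())
      return nullptr;
    return &*(It - 1);
  }

  std::size_t size() const { return Entries.size(); }
};
```

In a real backend the native offsets would only be known at emission time,
which is why the suggestion in the quoted reply is to produce such a table
from an AsmPrinterHandler-style hook rather than from IR-level metadata.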

I hope to eventually get that upstreamed, but we're not particularly
happy with what we've got at the moment and are in the process of a
rewrite.  If you're interested, I can try to do that rewrite upstream.
If I do, it'll be with the caveat that the code upstreamed will be
*extremely* experimental and likely to change radically over time.

Philip
