[llvm-dev] Debug Locations for Optimized Code

Robinson, Paul via llvm-dev llvm-dev at lists.llvm.org
Wed Dec 7 13:11:25 PST 2016


Is there a reason why we must only have one location for every instruction? If not, why not merge them and keep them all?

Not a requirement - of course we could keep them all with some kind of ordered list and even potentially include a "this is the one we would've picked" info (eg: the first one's the one we would pick today, if we would've picked one rather than none) so we could be backwards compatible if desired.

That would be a lot of engineering work to plumb through LLVM the notion of multiple debug locations, I think.

I'm not sure how DWARF (or CodeView) and their consumers currently cope with multiple locations - it's probably technically possible to describe using the line table format (not sure if it's intentional/documented for that purpose), but existing consumers might have to be fixed not to trip over it.

Technically the DWARF encoding of the line table does allow it; I've seen it happen, though not with the intent of describing two real source locations. It was by accident. (And it was one of the things that prompted me to submit patch D27492.) I seriously doubt any DWARF consumer takes the trouble to look for it. It's really not clear how a debugger *should* respond to seeing two source locations for one instruction.
--paulr
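
To make the ambiguity concrete, here is a minimal sketch of a line table with two rows at the same address. It is a simplified model, not the real DWARF line-number state machine: the `Row` type, the helper names, and the sample addresses are all made up for illustration, and the table is assumed to be sorted by address.

```python
from bisect import bisect_right
from collections import defaultdict
from typing import NamedTuple


class Row(NamedTuple):
    """One simplified line-table row (hypothetical model, not real DWARF)."""
    address: int
    file: str
    line: int


def ambiguous_addresses(table):
    """Return addresses that carry more than one distinct (file, line)."""
    grouped = defaultdict(set)
    for row in table:
        grouped[row.address].add((row.file, row.line))
    return sorted(addr for addr, locs in grouped.items() if len(locs) > 1)


def naive_lookup(table, pc):
    """Return the last row whose address is <= pc, or None.

    Assumes `table` is sorted by address. A consumer written this way
    silently picks one of the rows that share an address.
    """
    addresses = [row.address for row in table]
    index = bisect_right(addresses, pc)
    return table[index - 1] if index else None


# Illustrative data: address 0x1004 is described by two source lines.
TABLE = [
    Row(0x1000, "a.c", 10),
    Row(0x1004, "a.c", 11),
    Row(0x1004, "a.c", 27),
    Row(0x1008, "a.c", 12),
]
```

With this table, `ambiguous_addresses(TABLE)` reports `0x1004`, while `naive_lookup(TABLE, 0x1004)` returns only the line 27 row. The second location is present in the table, but a lookup like this never surfaces it.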

From: David Blaikie [mailto:dblaikie at gmail.com]
Sent: Wednesday, December 07, 2016 10:27 AM
To: Hal Finkel; Robinson, Paul
Cc: llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] Debug Locations for Optimized Code


On Wed, Dec 7, 2016 at 10:20 AM Hal Finkel <hfinkel at anl.gov> wrote:
----- Original Message -----
> From: "Paul Robinson" <paul.robinson at sony.com>
> To: "Hal Finkel" <hfinkel at anl.gov>, "David Blaikie" <dblaikie at gmail.com>
> Cc: llvm-dev at lists.llvm.org
> Sent: Wednesday, December 7, 2016 9:39:16 AM
> Subject: RE: [llvm-dev] Debug Locations for Optimized Code
>
> >> I don't know what the right, if any, solution to this is - but I
> >> thought I should bring it up in case you or anyone else wanted to
> >> puzzle it over & see if the competing needs/desires might need to
> >> be considered.
> > One thing that I recall being discussed was changing the way that we
> > set the is_stmt flag in the DWARF line-table information. As I
> > understand it, we currently set this flag for the first instruction
> > in any sequence that is on the same line. This is, in part, why the
> > debugger appears to jump around when stepping through code with
> > speculated instructions, etc. If we did not do this for out-of-place
> > instructions, then we might be able to keep debugging information
> > for tools while still providing a reasonable debugging experience.
>
> When we are looking at a situation where an instruction is merely
> *moved* from one place to another, retaining the source location and
> having a less naïve statement-marking tactic could help the debugging
> experience without perturbing other consumers (although one still
> wonders whether profiles will get messed up in cases where e.g. a loop
> invariant gets hoisted out of a cold loop into a hot predecessor).
>
> When we are looking at a situation where two instructions are *merged*
> or *combined* into one, and the original two instructions had
> different source locations, that's a separate problem.  In that case
> there is no single correct source location for the new instruction,
> and typically erasing the source location will give a better debugging
> experience (also a less misleading profile).
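
The merge case described in the quoted text can be sketched as a small policy function. This is only an illustration under stated assumptions: `Loc` and `merged_location` are hypothetical names, not LLVM APIs, and `None` stands in for "no source location".

```python
from typing import NamedTuple, Optional


class Loc(NamedTuple):
    """A source location (hypothetical model for illustration)."""
    file: str
    line: int
    col: int = 0


def merged_location(a: Optional[Loc], b: Optional[Loc]) -> Optional[Loc]:
    """Pick a location for an instruction formed by merging two others.

    The location is kept only when both inputs agree. If they differ, or
    either one is missing, it is erased (None) rather than guessed.
    """
    if a is not None and a == b:
        return a
    return None
```

For example, `merged_location(Loc("a.c", 10), Loc("a.c", 10))` keeps line 10, while `merged_location(Loc("a.c", 10), Loc("a.c", 27))` yields `None`. The *moved* case is different: the single original location is simply retained.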

Is there a reason why we must only have one location for every instruction? If not, why not merge them and keep them all?

Not a requirement - of course we could keep them all with some kind of ordered list and even potentially include a "this is the one we would've picked" info (eg: the first one's the one we would pick today, if we would've picked one rather than none) so we could be backwards compatible if desired.
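
A rough sketch of what such an ordered list might look like, purely as an illustration: `Loc`, `MultiLoc`, `primary`, and `merge` are invented names, not anything that exists in LLVM. The first entry plays the role of "the one we would've picked", so a consumer that only understands one location can keep reading `primary`.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Loc:
    """A source location (hypothetical model for illustration)."""
    file: str
    line: int


@dataclass(frozen=True)
class MultiLoc:
    """Ordered list of locations; the first entry is the preferred one."""
    locs: Tuple[Loc, ...] = ()

    @property
    def primary(self) -> Optional[Loc]:
        # Backwards-compatible view: the single location a legacy
        # consumer would see, or None if there is no location at all.
        return self.locs[0] if self.locs else None

    def merge(self, other: "MultiLoc") -> "MultiLoc":
        # Keep every distinct location, preserving order so that this
        # object's preferred location stays first.
        seen = []
        for loc in self.locs + other.locs:
            if loc not in seen:
                seen.append(loc)
        return MultiLoc(tuple(seen))
```

Merging `MultiLoc((Loc("a.c", 10),))` with `MultiLoc((Loc("a.c", 27), Loc("a.c", 10)))` gives the two locations for lines 10 and 27, with line 10 still reported as `primary`. Each extra entry is also extra data to carry around, which is where the size cost comes from.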

That would be a lot of engineering work to plumb through LLVM the notion of multiple debug locations, I think.

I'm not sure how DWARF (or CodeView) and their consumers currently cope with multiple locations - it's probably technically possible to describe using the line table format (not sure if it's intentional/documented for that purpose), but existing consumers might have to be fixed not to trip over it.

It'd certainly be cute/fun/nice to have the extra fidelity (though all extra fidelity also comes at a size cost to the IR and the resulting object/executable files).

Not sure anyone's in a position to sign up for that work right now - but maybe someone is. (looks like Apple's making a bit of a push on optimized debug info quality at the moment)

- David


 -Hal

>
> My personal opinion is that having sanitizers *rely* on debug info for
> accurate source attribution is just asking for trouble.  It happens to
> work at -O0 but cannot be considered reliable in the face of
> optimization. IMO this is a fundamental design flaw; debug info is
> best-effort and full of ambiguities, as shown above. Sanitizers need a
> more reliable source-of-truth, i.e. they should encode source info
> into their own instrumentation.
>
> --paulr
>
>

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory