[llvm-dev] Debug Locations for Optimized Code

Hal Finkel via llvm-dev llvm-dev at lists.llvm.org
Wed Dec 7 10:19:04 PST 2016


----- Original Message -----

> From: "Paul Robinson" <paul.robinson at sony.com>
> To: "Reid Kleckner" <rnk at google.com>
> Cc: "Hal Finkel" <hfinkel at anl.gov>, "David Blaikie"
> <dblaikie at gmail.com>, llvm-dev at lists.llvm.org
> Sent: Wednesday, December 7, 2016 11:01:57 AM
> Subject: RE: [llvm-dev] Debug Locations for Optimized Code

> > I don't see how ASan and debuggers are different. It feels like both
> > need reasonably accurate source location attribution for any
> > instruction. ASan just happens to care more about loads and stores
> > than interactive stepping debuggers.

> Actually they are pretty different in their requirements.

> ASan cares about *accurate* source location info for *specific*
> instructions, the ones that do something ASan cares about. The
> source attributions for any other instruction are irrelevant to ASan.
> The source attributions for these instructions *must* survive
> optimization.

> Debuggers care about *useful* source location info for *sets* of
> instructions, i.e. the instructions related to some particular
> source statement. If that set is only 90% complete/accurate, instead
> of 100%, generally that doesn't adversely affect the user
> experience. If you step past statement A, and happen to execute one
> or two instructions from the next statement B before you actually
> stop, generally that is not important to the user. Debuggers are
> able to tolerate a moderate amount of slop in the source
> attributions, because absolute accuracy is not critical to correct
> operation of the debugger. This is why optimizations can get away
> with dropping attributions that are difficult to represent
> accurately.

> ASan should be able to encode source info for just the instructions
> it cares about, e.g. pass an index or other encoded representation
> to the RT calls. Being actual parameters, they will survive any
> correct optimization, unlike today's situation where multiple calls
> might be merged by an optimization, damaging the correctness of ASan
> reports. (We've seen this exact thing happen.) ASan does not need a
> line table mapping all instructions back to their source; it needs a
> parameter at each call (more or less). It does need a file table,
> that's the main bit of redundancy with debug info that I see
> happening.

I suspect that you misunderstand where ASan instrumentation is added. Unlike UBSan, which is added by Clang during initial IR generation, ASan instrumentation is added late (at the EP_OptimizerLast extension point). I don't see any better way to get the location information at that point than using the existing debug info.
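
For concreteness, the scheme proposed above (an encoded location passed as an actual parameter to the runtime call, plus a file table) could be sketched roughly as follows. This is only an illustration: `encodeLoc`, `decodeLoc`, `reportBadAccess`, the file table contents, and the 12/20-bit split are all hypothetical names and choices, not part of ASan's actual runtime interface. Note that a pass running this late would still have to take the file and line from whatever debug location the instruction carries at that point.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical file table emitted next to the instrumentation.
static const std::vector<std::string> kFileTable = {"a.c", "b.c"};

struct SourceLoc {
  std::string file;
  uint32_t line;
};

// Hypothetical encoding: file index in the upper 12 bits,
// line number in the lower 20 bits.
inline uint32_t encodeLoc(uint32_t fileIdx, uint32_t line) {
  assert(fileIdx < (1u << 12) && "file index must fit in 12 bits");
  assert(line < (1u << 20) && "line number must fit in 20 bits");
  return (fileIdx << 20) | line;
}

// Recover file and line from an encoded location.
inline SourceLoc decodeLoc(uint32_t id) {
  return {kFileTable.at(id >> 20), id & 0xFFFFFu};
}

// Stand-in for a runtime report entry point. The location arrives as an
// ordinary argument, so it is part of the call rather than metadata
// attached to the call instruction.
inline std::string reportBadAccess(uint32_t locId) {
  SourceLoc loc = decodeLoc(locId);
  return loc.file + ":" + std::to_string(loc.line);
}
```

With this sketch, `reportBadAccess(encodeLoc(1, 42))` produces the string `b.c:42`.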

-Hal 

> --paulr

> From: Reid Kleckner [mailto:rnk at google.com]
> Sent: Wednesday, December 07, 2016 8:23 AM
> To: Robinson, Paul
> Cc: Hal Finkel; David Blaikie; llvm-dev at lists.llvm.org
> Subject: Re: [llvm-dev] Debug Locations for Optimized Code

> On Wed, Dec 7, 2016 at 7:39 AM, Robinson, Paul via llvm-dev <
> llvm-dev at lists.llvm.org > wrote:
> When we are looking at a situation where an instruction is merely *moved*
> from one place to another, retaining the source location and having a
> less naïve statement-marking tactic could help the debugging experience
> without perturbing other consumers (although one still wonders whether
> profiles will get messed up in cases where e.g. a loop invariant gets
> hoisted out of a cold loop into a hot predecessor).

> When we are looking at a situation where two instructions are *merged* or
> *combined* into one, and the original two instructions had different
> source locations, that's a separate problem. In that case there is no
> single correct source location for the new instruction, and typically
> erasing the source location will give a better debugging experience (also
> a less misleading profile).

> My personal opinion is that having sanitizers *rely* on debug info for
> accurate source attribution is just asking for trouble. It happens to
> work at -O0 but cannot be considered reliable in the face of optimization.
> IMO this is a fundamental design flaw; debug info is best-effort and full
> of ambiguities, as shown above. Sanitizers need a more reliable
> source-of-truth, i.e. they should encode source info into their own
> instrumentation.

> I don't see how ASan and debuggers are different. It feels like both
> need reasonably accurate source location attribution for any
> instruction. ASan just happens to care more about loads and stores
> than interactive stepping debuggers.

> It really doesn't make sense for ASan to invent another mechanism to
> track source location information. Any mechanism we build would be
> so redundant with debug info that, as an implementation detail, we
> would find a way to make them use the same storage when possible.
> With that in mind, maybe we should really find a way to mark source
> locations as "hoisted" or "sunk" so that we can suppress them from
> our line tables or do something else clever.
-- 

Hal Finkel 
Lead, Compiler Technology and Programming Languages 
Leadership Computing Facility 
Argonne National Laboratory 