[llvm-dev] Libfuzzer depending on uninitialized debug info

Robinson, Paul via llvm-dev llvm-dev at lists.llvm.org
Thu Dec 1 11:58:35 PST 2016



> -----Original Message-----
> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of
> Robinson, Paul via llvm-dev
> Sent: Thursday, December 01, 2016 11:08 AM
> To: llvm-dev at lists.llvm.org
> Subject: [llvm-dev] Libfuzzer depending on uninitialized debug info
> 
> TL;DR:  libFuzzer appears to depend on debug-info source locations for
> whatever IR instrumentation it uses; however, that instrumentation does
> not have proper source locations attached to it, leading to potentially
> incorrect reporting.  The short-term fix is to make sure the debug info
> it needs is actually set up; the long-term fix is not to rely on debug
> info, because some optimizations will (correctly) erase it.

Another way of looking at the problem is:
The fuzzer (or the sanitizer instrumentation it is using) is depending
on metadata for correctness, which is a no-no.
--paulr

> 
> The long version:
> 
> When Clang generates IR with debug info, one thing it does is attach a
> source location to most IR instructions.  This source location (at least
> in principle) is carried through optimizations, SelectionDAG, MachineIR,
> assembler source, and ultimately ends up in the "line table" in the
> object file.  The line table describes a mapping from the virtual
> addresses of instructions to source locations, which is very useful to
> debuggers and other tools.
> 
> Not all IR instructions have a source location attached to them.  When
> that happens, no specific line-table record is emitted for any machine
> instruction produced from that IR instruction.  In DWARF, that means you
> assume the instruction belongs to the same source location as the
> instruction that precedes it in memory.
> 
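
To make the "inherits from the preceding instruction" rule concrete, here
is a small Python sketch of a line-table lookup.  The table contents and
the helper name are made up for illustration; this is a model of the
lookup rule, not DWARF-parsing code.

```python
from bisect import bisect_right

# Hypothetical line table: (start_address, source_line) rows, sorted by address.
LINE_TABLE = [(0x00, 10), (0x08, 11), (0x20, 14)]

def line_for_address(table, addr):
    """Return the line of the last row starting at or before addr, or None."""
    starts = [start for start, _ in table]
    i = bisect_right(starts, addr) - 1
    return table[i][1] if i >= 0 else None

# The instruction at 0x10 has no row of its own, so it is attributed to
# whatever row precedes it in memory (the one starting at 0x08).
print(line_for_address(LINE_TABLE, 0x10))
```

An address without its own row silently takes the location of the row
before it, which is the behavior described above.
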
> This is a problem when the first instruction in a machine-basic-block has
> no explicit source location, because it implicitly inherits the source
> location of the last instruction of the basic block that precedes it in
> memory.  That means the source location is entirely at the mercy of
> block layout and other optimizations.
> 
> In effect, the source location for that instruction is UNINITIALIZED.
> 
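
A toy Python model of why this amounts to an uninitialized value: the
same block, laid out after two different predecessors, gets two different
locations for its first instruction.  Block names, instruction names, and
line numbers are invented for the example.

```python
def attribute_lines(layout):
    """Walk blocks in memory order.  An instruction with no explicit line
    (None) inherits the line of whatever precedes it in memory."""
    result = {}
    current = None
    for block in layout:
        for name, line in block:
            if line is not None:
                current = line
            result[name] = current
    return result

block_a = [("a0", 10), ("a1", 11)]
block_b = [("b0", 20), ("b1", 21)]
block_c = [("c0", None), ("c1", 30)]  # c0 has no source location attached

# Same code, two possible block layouts.
layout1 = attribute_lines([block_a, block_c, block_b])
layout2 = attribute_lines([block_a, block_b, block_c])
print(layout1["c0"], layout2["c0"])
```

Nothing about c0 itself changed between the two layouts; only its
neighbor in memory did.
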
> In r288283, I committed a patch that explicitly initialized the line
> number for some instructions to line 0.  The DWARF spec says that line 0
> means there is no specific source location for the instruction. Debuggers
> and other tools generally respond to this by looking *forward* in the
> instruction stream to find the *next* instruction with an explicit non-0
> location, rather than backward to the *previous* instruction with an
> explicit location.
> 
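
The forward-looking treatment of line 0 can be sketched in Python like
this.  It models only the behavior described above; what any particular
debugger actually does may differ, and the function name is hypothetical.

```python
def effective_line(rows, index):
    """rows: line numbers in address order, where 0 means "no location".
    Return the line at index, or the next non-zero line after it.
    Returns 0 if no later instruction has an explicit location."""
    for line in rows[index:]:
        if line != 0:
            return line
    return 0

rows = [10, 0, 0, 12]
# Index 1 is marked line 0, so a consumer looks ahead and lands on 12
# instead of inheriting 10 from the previous instruction.
print(effective_line(rows, 1))
```
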
> This caused a libFuzzer test to fail, because it depended on seeing a
> real source location for something, and got line 0 instead.  This tells
> me libFuzzer is depending on an uninitialized source location.  Kostya
> backed out that patch for me, but we really want to have it for improved
> debugger single-stepping behavior.
> 
> I am unclear on what instrumentation the fuzzer is using, although the
> instructions for building it suggest it's ASAN instrumentation. Whatever
> it is, either the instrumentation should use its own source-location
> information scheme, or it should initialize the debug info that it is
> depending on.
> 
> Note that debug info is not necessarily reliable in the face of
> optimization.  If two blocks with different source locations get merged,
> most likely the source location will be zeroed (and that's not my patch,
> that's optimization-specific behavior).  Therefore, I would recommend
> that fuzzer/asan/whoever stop relying on debug info for source locations,
> if we want all that to work on optimized code.
> 
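
As a minimal illustration of the merging case, assuming the simple policy
described above (keep the location if both sides agree, otherwise fall
back to line 0).  The helper is hypothetical and is not an LLVM API; real
passes each make their own choice.

```python
def merged_line(line_a, line_b):
    """Location for an instruction produced by merging two instructions:
    keep the line only when both inputs agree, else use 0 (no location)."""
    return line_a if line_a == line_b else 0

# Two blocks from different source lines get merged: the location is lost.
print(merged_line(42, 57))
# Both inputs came from the same line: the location survives.
print(merged_line(42, 42))
```

Anything that reads the location after such a merge sees line 0, however
carefully the instrumentation set it up beforehand.
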
> In the short term it's probably easier to find places where the
> instrumentation is missing debug info, and add it.  But that's not going
> to be reliable for optimized code.
> --paulr
> 

