[LLVMdev] Debug Info and DFSan

David Blaikie dblaikie at gmail.com
Tue Oct 7 16:10:08 PDT 2014


On Tue, Oct 7, 2014 at 3:41 PM, Peter Collingbourne <peter at pcc.me.uk> wrote:

> On Tue, Oct 07, 2014 at 03:30:49PM -0700, David Blaikie wrote:
> > On Tue, Oct 7, 2014 at 2:51 PM, Peter Collingbourne <peter at pcc.me.uk>
> wrote:
> >
> > > Looks good, thanks!
> > >
> > > Can you write the test case, please? You probably have more experience
> > > writing debug info tests than I do.
> > >
> >
> > Sure - though how would I get the pre-dfsan .ll file to produce this
> > behavior? I've tried compiling to a .ll file without dfsan, then feeling
> > that .ll through opt -dfsan, and I got different output. Instead of
> > wrapping main, it just replaced it. Is there some attribute or other
> thing
> > the frontend is adding to ensure main is wrapped, rather than converted?
>
> The pass takes an ABI list file, which in this case needs to contain the
> following two entries in order to tell the pass that main is
> uninstrumented:
>
> fun:main=uninstrumented
> fun:main=discard
>
> Please take a look at the test/Instrumentation/DataFlowSanitizer/abilist.ll
> test case for an example of how to specify an ABI list (you only need the
> -dfsan-abilist flag).
>

Lovely. Committed in r219249.


>
> Peter
>
> >
> >
> > >
> > > On Tue, Oct 07, 2014 at 02:35:49PM -0700, David Blaikie wrote:
> > > > Here's a basic patch which would solve it in sort of the same way as
> the
> > > > other optimizations I was fixing (just special case the debug info &
> fix
> > > it
> > > > up). I can work up a test case for this as well, or you can, if you
> > > > like/this seems reasonable.
> > > >
> > > > On Tue, Oct 7, 2014 at 2:30 PM, Peter Collingbourne <peter at pcc.me.uk
> >
> > > wrote:
> > > >
> > > > > On Tue, Oct 07, 2014 at 12:20:55PM -0700, David Blaikie wrote:
> > > > > > On Tue, Oct 7, 2014 at 12:18 PM, David Blaikie <
> dblaikie at gmail.com>
> > > > > wrote:
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Tue, Oct 7, 2014 at 12:10 PM, David Blaikie <
> dblaikie at gmail.com
> > > >
> > > > > wrote:
> > > > > > >
> > > > > > >>
> > > > > > >>
> > > > > > >> On Tue, Oct 7, 2014 at 11:48 AM, Peter Collingbourne <
> > > peter at pcc.me.uk
> > > > > >
> > > > > > >> wrote:
> > > > > > >>
> > > > > > >>> On Tue, Oct 07, 2014 at 10:04:30AM -0700, David Blaikie
> wrote:
> > > > > > >>> > Hi Peter,
> > > > > > >>> >
> > > > > > >>> > After discovering several bugs in ArgumentPromotion and
> > > > > > >>> > DeadArgumentElimination where llvm::Functions were replaced
> > > with
> > > > > > >>> similar
> > > > > > >>> > functions (with the same name) to transform their type in
> some
> > > > > way, I
> > > > > > >>> > started looking at all calls to llvm::Function::takeName to
> > > see if
> > > > > > >>> there
> > > > > > >>> > were any other debug info quality bugs in similar callers.
> > > > > > >>> >
> > > > > > >>> > One such caller is the DataFlowSanitizer, and I don't see
> any
> > > debug
> > > > > > >>> info
> > > > > > >>> > tests for this so I'm wondering what /should/ happen here.
> > > > > > >>> >
> > > > > > >>> > Is DFSan+DebugInfo something that matters? I assume so.
> > > > > > >>>
> > > > > > >>> It may be important in the future, but at the moment the
> dfsan
> > > > > runtime
> > > > > > >>> library
> > > > > > >>> does not make use of debug info. The debug info could still
> be
> > > > > useful for
> > > > > > >>> regular debugging tasks though.
> > > > > > >>>
> > > > > > >>
> > > > > > >> Yeah - that'd be the baseline. I assume some people probably
> care
> > > > > about
> > > > > > >> being able to debug a dfsan binary. Not sure if your users
> have
> > > > > > >> specifically highlighted this situation or come across the
> bugs
> > > I'm
> > > > > > >> speculating about.
> > > > > > >>
> > > > > > >>
> > > > > > >>> > It looks like DFSan is creating wrappers (in/around
> > > > > > >>> > DataFlowSanitizer.cpp:680-700) - when it does this, should
> it
> > > > > update
> > > > > > >>> the
> > > > > > >>> > debug info for these functions? Or are these internal
> > > > > instrumentation
> > > > > > >>> > functions & nothing to do with the code the user wrote? I
> can't
> > > > > quite
> > > > > > >>> tell
> > > > > > >>> > from the code.
> > > > > > >>>
> > > > > > >>> The functions created by that part of the code replace the
> > > original
> > > > > > >>> functions,
> > > > > > >>> so they should inherit the debug info for those functions.
> > > > > > >>>
> > > > > > >>
> > > > > > >> Yep, that won't happen for free - we have to stitch it back
> into
> > > the
> > > > > > >> debug info.
> > > > > > >>
> > > > > > >>
> > > > > > >>> But the code below that can also create wrapper functions
> which
> > > do
> > > > > not
> > > > > > >>> need
> > > > > > >>> debug info (lines 712-746). Wrappers normally show up for
> > > > > uninstrumented
> > > > > > >>> functions (e.g. main and many libc functions).
> > > > > > >>>
> > > > > > >>> > Could you provide any C/C++ source examples whis part of
> DFSan
> > > > > fires
> > > > > > >>> > reliably, so I could experiment with some examples and see
> how
> > > the
> > > > > > >>> debug
> > > > > > >>> > info looks?
> > > > > > >>>
> > > > > > >>> This is an example of a program that exercises the
> replacement
> > > > > function
> > > > > > >>> and
> > > > > > >>> wrapper features.
> > > > > > >>>
> > > > > > >>>
> > > > > > >>>
> > > > >
> > >
> --------------------------------------------------------------------------------
> > > > > > >>> #include <stddef.h>
> > > > > > >>> #include <string.h>
> > > > > > >>>
> > > > > > >>> size_t len(size_t (*strlen_ptr)(const char *), const char
> *str) {
> > > > > > >>>   return strlen_ptr(str);
> > > > > > >>> }
> > > > > > >>>
> > > > > > >>> int main(void) {
> > > > > > >>>   return len(strlen, "foo");
> > > > > > >>> }
> > > > > > >>>
> > > > > > >>>
> > > > >
> > >
> --------------------------------------------------------------------------------
> > > > > > >>>
> > > > > > >>> In this example, 'len' is rewritten to 'dfs$len', 'main'
> keeps
> > > its
> > > > > > >>> original
> > > > > > >>> name (the pass treats it as an uninstrumented function), and
> > > > > wrappers are
> > > > > > >>> created for 'main' and 'strlen' (the wrapper for 'main' is
> > > unused as
> > > > > the
> > > > > > >>> C runtime calls the regular 'main' function directly).
> > > > > > >>>
> > > > > > >>
> > > > > > >> OK, so when you say wrappers are created - where does the
> name go?
> > > > > "main"
> > > > > > >> keeps the name? (it's important which way this transformation
> is
> > > done
> > > > > - if
> > > > > > >> the guts of main are moved to a new function and the name
> moved as
> > > > > well,
> > > > > > >> that's not the same as keeping the same function as far as
> debug
> > > info
> > > > > > >> metadata is concerned)
> > > > >
> > > > > In this case the function keeps its original name and the wrapper
> is
> > > > > created
> > > > > separately.
> > > > >
> > > > > > >>> I compile this with '-O0 -g'. A 'break main'/'run'/'break
> > > > > strlen'/'cont'
> > > > > > >>> gives a relevant stack trace:
> > > > > > >>>
> > > > > > >>> #0  __strlen_sse2_pminub () at
> > > > > > >>> ../sysdeps/x86_64/multiarch/strlen-sse2-pminub.S:33
> > > > > > >>> #1  0x00005555555587ff in __dfsw_strlen (s=0x55555556fe17
> "foo",
> > > > > > >>> s_label=<optimized out>, ret_label=0x7fffffffddee)
> > > > > > >>>     at
> llvm/projects/compiler-rt/lib/dfsan/dfsan_custom.cc:203
> > > > > > >>> #2  0x000055555556bbdc in dfsw$strlen ()
> > > > > > >>> #3  0x000055555556bb51 in len (strlen_ptr=0x55555556bbc0
> > > > > <dfsw$strlen>,
> > > > > > >>> str=0x55555556fe17 "foo") at strlen.c:5
> > > > > > >>> #4  0x000055555556bb96 in main ()
> > > > > > >>>
> > > > > > >>> In this stack trace, #2 is the compiler-generated wrapper
> > > function
> > > > > for
> > > > > > >>> strlen.
> > > > > > >>>
> > > > > > >>> It looks like the debug info for 'len' is preserved
> correctly,
> > > but I
> > > > > > >>> don't
> > > > > > >>> know why the debug info for 'main' is missing.
> > > > > > >>>
> > > > > > >>
> > > > > > >> Yeah - not quite sure either. I'll poke around with it for a
> > > little
> > > > > bit.
> > > > > > >>
> > > > > > >
> > > > > > > DataFlowSanitizer.cpp:719: F.replaceAllUsesWith(WrappedFnCst);
> > > > > > >
> > > > > > > replaces the debug info metadata's pointer to main with a
> pointer
> > > to
> > > > > > > dfsw$main (this issue doesn't come up with DAE or ArgPromo
> because
> > > they
> > > > > > > both don
> > > > > > >
> > > > > >
> > > > > > ... they both don't use RAUW because they have to visit each call
> > > site to
> > > > > > do special stuff anyway. (they're replacing one function with
> > > another of
> > > > > a
> > > > > > different type, RAUW isn't suitable)
> > > > > >
> > > > > > I /guess/ we never end up codegen'ing dfsw$main? (I haven't
> looked)
> > > and
> > > > > > thus the debug info doesn't describe any function at all. It's
> > > possible
> > > > > > that there's some other reason we don't end up describing any
> > > function at
> > > > > > all...
> > > > > >
> > > > > > In any case if we did have "main" in the debug info describe a
> > > function,
> > > > > at
> > > > > > this point, it would be describing the dfsw$main, not main main.
> So
> > > we'd
> > > > > > need to handle remapping the debug info back to the original
> > > > > uninstrumented
> > > > > > function anyway, I assume? (that's the expected behavior? That
> the
> > > > > > uninstrumented function is the one described by debug info, not
> the
> > > > > > instrumented wrapper?)
> > > > >
> > > > > Yes, the debug info describes 'main', not 'dfsw$main'. So I agree
> that
> > > > > the references in the debug info would need to refer to 'main'
> instead
> > > of
> > > > > 'dfsw$main'. Not sure if there is a good way to do that. Perhaps we
> > > could
> > > > > manually RAUW and skip metadata, but I'm not sure if that would
> work
> > > if the
> > > > > debug metadata is not a direct user.
> > > > >
> > > > > Thanks,
> > > > > --
> > > > > Peter
> > > > >
> > >
> > > > diff --git lib/Transforms/Instrumentation/DataFlowSanitizer.cpp
> > > lib/Transforms/Instrumentation/DataFlowSanitizer.cpp
> > > > index 446bcf7..cb84fd6 100644
> > > > --- lib/Transforms/Instrumentation/DataFlowSanitizer.cpp
> > > > +++ lib/Transforms/Instrumentation/DataFlowSanitizer.cpp
> > > > @@ -51,6 +51,7 @@
> > > >  #include "llvm/ADT/StringExtras.h"
> > > >  #include "llvm/Analysis/ValueTracking.h"
> > > >  #include "llvm/IR/Dominators.h"
> > > > +#include "llvm/IR/DebugInfo.h"
> > > >  #include "llvm/IR/IRBuilder.h"
> > > >  #include "llvm/IR/InlineAsm.h"
> > > >  #include "llvm/IR/InstVisitor.h"
> > > > @@ -243,6 +244,7 @@ class DataFlowSanitizer : public ModulePass {
> > > >    DFSanABIList ABIList;
> > > >    DenseMap<Value *, Function *> UnwrappedFnMap;
> > > >    AttributeSet ReadOnlyNoneAttrs;
> > > > +  DenseMap<const Function *, DISubprogram> FunctionDIs;
> > > >
> > > >    Value *getShadowAddress(Value *Addr, Instruction *Pos);
> > > >    bool isInstrumented(const Function *F);
> > > > @@ -573,6 +575,8 @@ bool DataFlowSanitizer::runOnModule(Module &M) {
> > > >    if (ABIList.isIn(M, "skip"))
> > > >      return false;
> > > >
> > > > +  FunctionDIs = makeSubprogramMap(M);
> > > > +
> > > >    if (!GetArgTLSPtr) {
> > > >      Type *ArgTLSTy = ArrayType::get(ShadowTy, 64);
> > > >      ArgTLS = Mod->getOrInsertGlobal("__dfsan_arg_tls", ArgTLSTy);
> > > > @@ -725,6 +729,12 @@ bool DataFlowSanitizer::runOnModule(Module &M) {
> > > >        Value *WrappedFnCst =
> > > >            ConstantExpr::getBitCast(NewF,
> PointerType::getUnqual(FT));
> > > >        F.replaceAllUsesWith(WrappedFnCst);
> > > > +
> > > > +      // Patch the pointer to LLVM function in debug info
> descriptor.
> > > > +      auto DI = FunctionDIs.find(&F);
> > > > +      if (DI != FunctionDIs.end())
> > > > +        DI->second.replaceFunction(&F);
> > > > +
> > > >        UnwrappedFnMap[WrappedFnCst] = &F;
> > > >        *i = NewF;
> > > >
> > >
> > >
> > > --
> > > Peter
> > >
>
> --
> Peter
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141007/5fcf3e7e/attachment.html>


More information about the llvm-dev mailing list