[llvm-commits] MemorySanitizer, first patch
Chandler Carruth
chandlerc at google.com
Fri Jul 27 02:26:08 PDT 2012
I'm going to be going over this code with a fine-tooth comb over the next
few days and getting you a thorough review, but I wanted to start off by
saying: *wow*. Those are phenomenal performance numbers. I'm very excited
about this.
On Fri, Jul 27, 2012 at 2:20 AM, Kostya Serebryany <kcc at google.com> wrote:
> Fresh numbers after several performance fixes in msan run-time.
> The average slowdown of msan is 3x (it ranges between 2x and 4x).
> When running under msan+DynamoRIO (msandr), the average slowdown is 3.6x.
>
> # benchmark      native-time(seconds)  msan-slowdown(times)  msan+dr-slowdown(times)
> 400.perlbench 379 3.71 4.75
> 401.bzip2 483 3.02 2.93
> 403.gcc 312 2.91 4.22
> 429.mcf 313 2.97 2.97
> 445.gobmk 429 3.32 6.41
> 456.hmmer 606 3.69 2.55
> 458.sjeng 491 3.98 6.39
> 462.libquantum 446 2.72 2.18
> 464.h264ref 573 3.61 5.13
> 471.omnetpp 317 2.16 2.91
> 473.astar 412 2.26 2.32
> 483.xalancbmk 241 3.11 4.99
> 433.milc 402 2.25 2.31
> 444.namd 374 2.53 2.22
> 447.dealII 311 2.82 3.39
> 450.soplex 232 2.60 2.80
> 453.povray 197 3.35 5.28
> 470.lbm 321 2.10 2.28
> 482.sphinx3 496 4.00 3.35
>
> Funny fact: sometimes the DynamoRIO tool makes things faster due to its
> "trace optimization". We could probably do the same optimization in the LLVM pass.
>
> Thanks,
> --kcc
>
>
>
> On Thu, Jul 19, 2012 at 7:42 PM, Evgeniy Stepanov <
> eugeni.stepanov at gmail.com> wrote:
>
>> We've solved the issue of external code with the help of dynamic
>> binary instrumentation. A simple (~500 LOC) DynamoRIO client zero-fills
>> shadow memory for all stores and return values in uninstrumented code,
>> thus avoiding false positives.
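The zero-fill strategy described above can be sketched with a toy shadow-memory model. All names here are illustrative, not the actual DynamoRIO client API; it only shows the idea that stores from uninstrumented code clear the corresponding shadow:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Toy model of the zero-fill strategy. One shadow byte per application
// byte: 0xFF means "uninitialized" (poisoned), 0x00 means "initialized".
constexpr size_t kMem = 64;
uint8_t app[kMem];
uint8_t shadow[kMem];

// Instrumented code poisons freshly allocated memory.
void poison(size_t off, size_t n) { memset(shadow + off, 0xFF, n); }

// The DBI client intercepts every store made by uninstrumented code and
// zero-fills the matching shadow bytes, so a later read of that memory in
// instrumented code is not reported as a use of an uninitialized value.
void uninstrumented_store(size_t off, const void *src, size_t n) {
  memcpy(app + off, src, n);
  memset(shadow + off, 0x00, n);  // mark the written range initialized
}

bool is_poisoned(size_t off) { return shadow[off] != 0; }
```

The same rule is applied to return values of uninstrumented functions: anything external code produces is treated as initialized, trading false positives for false negatives.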
>>
>> I'm attaching the new patch to LLVM with a slightly improved
>> instrumentation pass. Repository at
>> http://code.google.com/p/memory-sanitizer/source/browse/ contains the
>> runtime library and the DynamoRio client.
>>
>> In its current state, the tool can handle all C and C++ tests in SPEC
>> CPU2006, finding at least one use of uninitialized data that is also
>> detected by Memcheck (Valgrind), plus one or two false positives coming
>> from our inexact handling of certain instructions.
>>
>> See the performance numbers. "MSan time" was measured by running
>> without the dynamic tool, initializing shadow to 0x0 instead of 0xFF
>> to avoid reporting any errors. "MSan/DR time" is from an honest run of
>> an instrumented benchmark under the DynamoRio-based tool.
>>
>> # test, native time, MSan time, MSan slowdown, MSan/DR time, MSan/DR slowdown
>> 400.perlbench, 22.40, 116.00, 5.18, 138.00, 6.16
>> 401.bzip2, 41.00, 155.00, 3.78, 149.00, 3.63
>> 403.gcc, 0.79, 3.52, 4.48, 10.50, 13.38
>> 429.mcf, 13.80, 37.00, 2.68, 46.70, 3.38
>> 445.gobmk, 83.10, 328.00, 3.95, 440.00, 5.29
>> 456.hmmer, 59.70, 172.00, 2.88, 145.00, 2.43
>> 458.sjeng, 105.00, 305.00, 2.90, 391.00, 3.72
>> 462.libquantum, 1.66, 4.37, 2.63, 3.88, 2.34
>> 464.h264ref, 71.30, 267.00, 3.74, 358.00, 5.02
>> 471.omnetpp, 42.20, 246.00, 5.83, 436.00, 10.33
>> 473.astar, 82.40, 243.00, 2.95, 285.00, 3.46
>> 483.xalancbmk, 48.40, 644.00, 13.31, 1164.00, 24.05
>> 433.milc, 14.00, 37.60, 2.69, 40.80, 2.91
>> 444.namd, 10.20, 33.20, 3.25, 34.20, 3.35
>> 453.povray, 6.50, 32.20, 4.95, 48.80, 7.51
>> 470.lbm, 31.00, 107.00, 3.45, 138.00, 4.45
>> 482.sphinx3, 7.66, 23.90, 3.12, 26.80, 3.50
>>
>>
>> On Wed, Jun 27, 2012 at 10:50 PM, Kostya Serebryany <kcc at google.com>
>> wrote:
>> >
>> >
>> > On Wed, Jun 27, 2012 at 7:30 PM, John Criswell <criswell at illinois.edu>
>> > wrote:
>> >>
>> >> On 6/26/12 11:20 AM, Kostya Serebryany wrote:
>> >>
>> >> John,
>> >>
>> >> On Tue, Jun 26, 2012 at 7:45 PM, John Criswell <criswell at illinois.edu>
>> >> wrote:
>> >>>
>> >>> On 6/26/12 4:18 AM, Kostya Serebryany wrote:
>> >>>
>> >>> Updated the patch to use the new interface for TLS
>> >>> (GlobalVariable::GeneralDynamicTLSModel).
>> >>> Comments?
>> >>>
>> >>>
>> >>> First, there's the non-technical question of whether this belongs in
>> LLVM
>> >>> mainline or not in its current state. Do you have users of this tool
>> in the
>> >>> same way that you had users of ASan? What platforms have you tested
>> this
>> >>> tool on? Do you get false positives when you link with native code
>> >>> libraries, and if so, is this false positive rate low? I get the
>> impression
>> >>> that this tool isn't as mature as TSan or ASan,
>> >>
>> >>
>> >> Your impression is correct -- msan is not yet usable. It works only on
>> >> linux-x86_64, and only on small libc-free tests.
>> >> It has been heavily tested (on SPEC and other large programs), but
>> >> without actually poisoning the shadow.
>> >>
>> >>>
>> >>> so I'm wondering whether it's really ready to go in yet.
>> >>
>> >>
>> >> Having these 500 LOC in trunk will simplify further development a lot.
>> >> (we will also need a few lines of driver changes).
>> >> I can't tell whether this justifies the commit to trunk now (it
>> certainly
>> >> does for me).
>> >>
>> >>
>> >> I think it would be better to fix the external code issue first before
>> >> trying to commit to LLVM trunk; it's the only obstacle I see that
>> prevents
>> >> you from having a working tool. That way, you can have a working tool
>> and
>> >> get feedback on it; you'll also know what the ramifications are for
>> your
>> >> approach to handling external library code. As it is right now, I
>> think
>> >> this new tool is still in too early a stage to add to LLVM.
>> >
>> >
>> > Having this code in trunk will simplify and speed up the development of
>> the
>> > dynamic component(s).
>> > It may also inspire some research in the area of LLVM-based binary
>> > instrumentation.
>> > But I'll agree on what the community says.
>> >
>> >
>> >>
>> >>
>> >>
>> >>
>> >>>
>> >>>
>> >>> Second, on the technical side, I think dealing with the external code
>> >>> problem has to be addressed before the code can be committed; I
>> suspect the
>> >>> tool will return too many false positives if you don't address it (if
>> you
>> >>> have experiments indicating otherwise, please let us know). I'm
>> assuming
>> >>> from the comment at the beginning of the pass's implementation that
>> it's
>> >>> still unaddressed.
>> >>
>> >>
>> >> We are working on this, in fact experimenting in a few different
>> >> directions (DynamoRIO, PIN, static instrumentation, even full
>> recompile).
>> >>
>> >>
>> >> Okay.
>> >>
>> >>
>> >>
>> >>>
>> >>> I think instrumenting all library code is unrealistic; you'd have to
>> have
>> >>> every library on your system compiled with the sanitizer (at least
>> for large
>> >>> GUI applications).
>> >>
>> >>
>> >> It may still be doable in some restricted environments, but yes, I
>> agree
>> >> in general.
>> >>
>> >>>
>> >>> I think there's a few other approaches that you can take:
>> >>>
>> >>> 1) Assume external code works properly. To do this, you first ensure
>> >>> that memory allocated by external code is always marked as
>> initialized.
>> >>> Second, when passing memory objects from internal, instrumented code
>> to
>> >>> external library code, you mark the memory as initialized (since
>> you're
>> >>> assuming the external code will initialize it).
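A rough sketch of this option (1), under a byte-granularity shadow model and with illustrative names, would clear the shadow of the argument's block around the external call:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Byte-granularity shadow: 0xFF = uninitialized, 0x00 = initialized.
// Names and layout are illustrative only.
constexpr size_t kMem = 32;
uint8_t shadow[kMem];

void poison(size_t off, size_t n) { memset(shadow + off, 0xFF, n); }
void unpoison(size_t off, size_t n) { memset(shadow + off, 0x00, n); }

// At an instrumented -> external call boundary, assume the library
// initializes whatever it touches: clear the shadow for the argument's
// block before (and, conservatively, after) the call.
void call_external(void (*fn)(), size_t arg_off, size_t arg_size) {
  unpoison(arg_off, arg_size);
  fn();
  unpoison(arg_off, arg_size);
}
```

Note that this only covers the top-level block handed to the library; pointers stored inside that block are not followed.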
>> >>
>> >>
>> >> That doesn't work.
>> >> What if I pass my pointer to a library function?
>> >> What has to be un-poisoned (marked as initialized)?
>> >> The contents of the heap/stack/bss block to which this pointer belongs?
>> >>
>> >>
>> >> Yes, that's what I was thinking.
>> >>
>> >>
>> >> How about pointers that reside in this block? Do we need to go deeper?
>> >>
>> >>
>> >> Ah, that's a problem. You might be able to do this conservatively by
>> >> treating values that point into valid memory objects as pointers (like
>> >> conservative garbage collection), but it'd be expensive and probably
>> not
>> >> worth the trouble.
>> >
>> >
>> > It'll be prohibitively expensive, if possible at all.
>> > Even with all the hacks we put into the allocator code, checking whether
>> > a given integer is a currently addressable pointer is not free. For
>> > stack buffers that's even more expensive.
>> > Some pointers may point slightly off the buffer (e.g., to the -1'th
>> > element).
>> >
>> >>
>> >>
>> >>
>> >>
>> >>>
>> >>>
>> >>> Of course, you can do things like special-casing the C library to
>> reduce
>> >>> false negatives, and you can instruct users to compile library code
>> with the
>> >>> compiler to find more bugs.
>> >>>
>> >>> 2) Use something like SAFECode's CompleteChecks pass to determine
>> which
>> >>> checks are on memory objects handled completely within the analyzed
>> portion
>> >>> of the program and which checks can access memory handled by external
>> code.
>> >>> Checks on memory objects handled by external code always pass (or just
>> >>> generate a warning instead of an error, or something like that). This
>> >>> approach requires a good points-to analysis algorithm. SAFECode uses
>> DSA
>> >>> which is unification-based. You can either use that or, if you
>> prefer, use
>> >>> a subset based algorithm like Anderson's. I think there's one
>> >>> implementation of Anderson's for LLVM out there, but I don't know how
>> robust
>> >>> it is.
>> >>
>> >>
>> >> I'll have to look at this more, but I don't (yet) see how it can help.
>> >> Again, suppose
>> >>
>> >> struct A { int stuff; };
>> >> struct B { A *a; };
>> >> struct C { A *a; B *b; };
>> >> ...
>> >> struct Z { A *a; B *b; C *c; ... Y *y; };
>> >>
>> >> ...
>> >> Z *z = new Z;
>> >> library_function(z);
>> >>
>> >> library_function can write to any part of Z, but we can't even traverse
>> >> all of it.
>> >> Can we do anything?
>> >>
>> >>
>> >> No, you can't really say anything about the object to which z points or
>> >> the objects reachable from the pointer z. Any checks on z, z->a, etc.
>> would
>> >> have to conservatively pass.
>> >
>> >
>> >
>> >>
>> >> This approach doesn't help you check for initialization errors with
>> >> library code; it helps prevent library code from triggering false
>> positives.
>> >> It ensures that all of your checks are performed on memory objects
>> handled
>> >> by internal code.
>> >>
>> >> For memory safety, relaxing the safety guarantees for memory objects
>> >> passed to library code still results in a usable system that catches
>> memory
>> >> safety errors. I don't know if the same is true for uninitialized
>> value
>> >> detection; you'll probably have to experiment and see.
>> >
>> >
>> > That's not true for uninitialized value detection.
>> >
>> > Both asan and tsan know nothing about the code in the libraries and work
>> > just fine.
>> > They miss some percentage of bugs (based on our experience with Valgrind,
>> > our estimate is 5%-10% of bugs).
>> > Adding a dynamic component to asan/tsan will add that coverage, and we
>> > are going to do it.
>> >
>> > For msan the story is completely different.
>> > You either cover everything, including system calls, or don't go there.
>> > Any memory write that the tool does not observe, followed by a read in
>> > the instrumented code, leads to a false report.
>> >
>> >
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>>
>> >>>
>> >>> 3) Use dynamic binary instrumentation to handle external code.
>> >>>
>> >>> The first option is probably the easiest to implement. The second
>> option
>> >>> requires a points-to analysis that can go into LLVM mainline. The
>> third
>> >>> option is a fair amount of work but probably easier than using
>> points-to
>> >>> analysis and will probably yield a better debugging tool.
>> >>
>> >>
>> >> We *are* working on #3 (see above).
>> >>
>> >>
>> >> I think using dynamic binary translation is probably the best option
>> out
>> >> of the three. Just take my opinion with the caveat that I haven't run
>> any
>> >> experiments on uninitialized value usage.
>> >
>> >
>> > It's actually the worst option, but the only one I know how to
>> implement.
>> > I'd really prefer a fully static approach (static binary translation, or
>> > full recompile) but I don't see any suitable technology. :(
>> >
>> >>>
>> >>>
>> >>> Third, other technical nit-picks:
>> >>>
>> >>> 1) IMHO, in general, every method and function should have a comment
>> >>> explaining what it does and the meaning of its inputs and return
>> values.
>> >>> Descriptions should use complete sentences.
>> >>
>> >>
>> >> Agree, I'll add comments.
>> >>
>> >>>
>> >>>
>> >>> 2) You may want to consider using a class derived from InstVisitor
>> >>> instead of writing large if/then/else or switch code.
>> >>
>> >>
>> >> I didn't know about InstVisitor.
>> >> However, I am not sure whether it would be simpler. I will have a
>> >> closer look.
>> >>
>> >>
>> >> InstVisitor is faster than using dyn_cast<>() and is the same as using
>> >> a switch on the instruction opcode (which your code does in some
>> >> places). It may make your code more readable.
>> >
>> >
>> > I am convinced. The new patch uses InstVisitor.
>> >
>> >>
>> >>
>> >>
>> >>
>> >>>
>> >>>
>> >>> 3) If I understand correctly, a "shadow instruction" propagates
>> >>> information about initialization status through a computation (so, for
>> >>> example, if you have a c = add a, b instruction, there's a shadow
>> >>> instruction that computes c's initialization status based on a and b's
>> >>> initialization status).
>> >>
>> >>
>> >> Correct.
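For illustration, the propagation rule for `c = add a, b` can be modeled as the bitwise OR of the operand shadows. This is a deliberate simplification (the actual pass computes more precise shadows for some instructions), and the names below are made up:

```cpp
#include <cassert>
#include <cstdint>

// A value paired with its per-bit shadow: a set shadow bit means that bit
// of the value is (possibly) uninitialized. Simplified model only.
struct SV { uint32_t val; uint32_t shadow; };

// Conservative rule for c = add a, b: any uninitialized bit in either
// operand may affect the result, so OR the operand shadows together.
SV add(SV a, SV b) { return {a.val + b.val, a.shadow | b.shadow}; }

// Constants are fully initialized, i.e., they carry a zero shadow.
SV constant(uint32_t x) { return {x, 0}; }
```

Under this rule a result is reported only if a poisoned value actually flows into an observable use, rather than on the first copy of uninitialized data.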
>> >>
>> >>>
>> >>> It looks like you add shadow instructions for all instructions and
>> then
>> >>> "fill in" the operands to the shadow instructions by iterating over
>> the
>> >>> instructions a second time. Is this correct?
>> >>
>> >>
>> >> Yes.
>> >>
>> >>
>> >> How do you handle passing taint (i.e., initialization status) between
>> >> actual parameters and formal parameters in a function call? Do you
>> store
>> >> the information to memory and then read it in the callee?
>> >
>> >
>> > Yes, the shadow is passed via a dedicated thread-local memory slot.
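That scheme could be modeled roughly as follows; the slot names and layout here are made up for illustration and the real runtime's ABI differs:

```cpp
#include <cassert>
#include <cstdint>

// Toy model of passing argument/return shadow through thread-local slots.
thread_local uint32_t param_shadow[8];  // written by caller, read by callee
thread_local uint32_t retval_shadow;    // written by callee, read by caller

uint32_t square(uint32_t x) {
  uint32_t sx = param_shadow[0];  // fetch the shadow of formal parameter x
  retval_shadow = sx;             // result's shadow depends only on x's
  return x * x;
}

uint32_t call_square(uint32_t x, uint32_t x_shadow) {
  param_shadow[0] = x_shadow;     // store the actual argument's shadow
  return square(x);
}
```

Because the slots are thread-local, no synchronization is needed, and the scheme works without changing the function's visible signature.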
>> >
>> >>
>> >>
>> >>
>> >>
>> >>>
>> >>> If so, there's an alternative approach: if you fetch the dominator
>> tree
>> >>> for the function and visit instructions in dominator tree order, then
>> you'll
>> >>> visit all definitions before all uses. That will alleviate the need
>> to make
>> >>> two passes over all the instructions. The only tricky part is phi
>> nodes,
>> >>> but for those, you add shadow instructions (phi nodes with undef
>> operands)
>> >>> to them first and then visit all the other instructions in dominator
>> tree
>> >>> order; you fill in the operands of the shadow phi nodes at the end.
>> >>>
>> >>> I'm not sure which approach is faster (I don't think LLVM's dominator
>> >>> tree pass is the most efficient implementation), but it's something
>> to think
>> >>> about.
>> >>
>> >>
>> >> I actually did think about this. I did not experiment with using the
>> >> dominator tree, but I think that the current way is at least as
>> efficient.
>> >>
>> >>
>> >> It's really a question of whether computing the dominator tree is
>> >> faster or slower than iterating over all instructions twice (or whether
>> >> the dominator tree has already been computed by some previously run
>> >> transform). I suppose the dominator tree might be better if you're not
>> >> going to instrument all instructions.
>> >
>> >
>> > Unless I am mistaken, depth-first order is enough.
>> > The new patch uses that.
>> >
>> >
>> > Thanks!
>> >
>> > --kcc
>> >
>> > _______________________________________________
>> > llvm-commits mailing list
>> > llvm-commits at cs.uiuc.edu
>> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>> >
>>
>
>