[llvm-dev] [RFC] Design of a TBAA sanitizer
Stephen Kell via llvm-dev
llvm-dev at lists.llvm.org
Wed Apr 12 03:09:01 PDT 2017
> >>A design goal of a TBAA sanitizer is to limit the shadow-memory overhead
> >>of the implementation. ASan, for example, uses 1 bit per byte. Here
> >>we're hoping to keep the overhead down to 2 bits per byte for the TBAA
> >>sanitizing. We might be able to do this, while handling all common types
> >>on the fast path, if we use both alignment and type information.
> >Slightly provocative question, but are you sure that byte-scale shadow
> >memory is a good fit?
>
> Am I sure? No. But I *think* that is because I'd like to avoid false
> positives, and that means dealing with places where we dynamically change
> the type of part of an allocation (via placement new or whatever). This
> certainly seems necessary to deal with some C++ containers at least.
(Sorry for the delayed response. Short summary is "I agree with you" but
wanted to pick some nits/details anyway. :-)
Can you elaborate on the C++ containers thing? If it's just std::variant
and things like std::any that use aligned_storage_t, then see below.
The dynamically-changing-types thing seems to work okay in both cases.
In my runtime it is possible to change the recorded type of a heap
allocation, and to create new types at run time. So there are levers to
deal with the "part of an allocation" thing, though it does get a bit
ugly. (At present, a bit more engineering is necessary to support this
trick for stack objects, e.g. to do placement new on them... but I see
no real problem supporting it.)
I can see the simplicity benefits of shadow memory, so I'm not arguing
too hard against it -- just feeling my way around the issues....
> >Just so I'm understanding: is this talking about shadowing memory with
> >its "leaf" type only, or with its "top-level" type? So if I have, say,
> >the following usually-64-bit type:
> >
> >struct compound { int x; float y; }
> >
> >... would we paint the shadow area with an alternating pattern of int
> >(0110 0110) and floats (0110 1001)? Or would we be using the "all other
> >types" thing since the memory is actually "compound"?
>
> I intended to imply that we'd fill with the alternating pattern indicating
> int and float.
Understood.
I'm now scratching my head about whether that sacrifices the ability to
detect UB where the problematic "access" is of a composite type. Passing
a struct to a function is a kind of access that comes to mind. (I'm
wavering about "object" versus "subobject", a distinction which I have
never properly understood in C11....)
> >>For pointers, this scheme would consider all pointers to be the same
> >>(regardless of pointee type). Doing otherwise would mostly requiring
> >>putting pointer-type checking on the slow path (i.e. access via a
> >>pointer pointer), and that could add considerable overhead. We might,
> >>however, split out function pointers from other pointers. We could
> >>provide a compile-time option to control the granularity of pointer-type
> >>checks.
> >Aha, the pointers-to-pointers problem. Treating all pointers the same
> >feels like a mistake because, say, aliasing a T* through a void** is
> >frequently done and really quite useful (although technically UB, if I
> >understand correctly; is this exploited in LLVM?
>
> I don't know if it is exploited by LLVM, although as I recall, all
> implementations of posix_memalign and similar functions potentially run
> afoul of the rules (because they need to store the new pointer as a void*
> regardless of what it actually is).
I think returning a pointer like posix_memalign() does it is okay -- the
client just has to declare the written-to pointer as a void*. If they
want a different type of pointer, they have to do a cast/assignment
after the call. So it's clients that are in danger of UB here... they
might use it the sloppy way, by casting a T** to void**. Of course,
*not* doing this gets tedious in some cases, e.g. when calling generic
linked-structure routines written in C.
> I'm not sure how strict we can be here
> in practice. Currently, AFAIKT, Clang emits TBAA metadata which does not
> differentiate between pointer types (it calls them all "any pointer").
Interesting -- I should take a look at this (thanks).
> >), whereas aliasing a T*
> >through an arbitrary S** is probably a bug.
> >
> >Just for contrast: I'm familiar with this problem, and my way around
> >this comes in a few parts. I check pointer creation (casts) only; checks
> >on pointer use (dereference) are unnecessary. Usually it takes a while
> >before people believe me on this, but the short story is that
> >pointer-creation checks enforce an invariant that no bad-type pointer
> >gets into circulation. So dereferences are always okay (until the first
> >warning, at least; execution can continue after a warning, in a
> >best-effort no-guarantees fashion). This also greatly helps performance.
>
> I'm sure this is true, but the problem is that C/C++ don't actually outlaw
> such pointer casts. As you also mention below, it only becomes a problem if
> such pointers are used to access an object of some incompatible type. If the
> sanitizer has false positives then it is not nearly as useful to me, even
> though the technique you describe might actually find more bugs.
Fair point. For what it's worth, I do also care a lot about minimising
false positives, and there's a trick I should have mentioned: issuing a
"trap pointer" when a bad cast occurs, to delay warnings until it's
used. On x86-64 this just means flipping some high-order bits so we get
a non-canonical address that can be losslessly converted back to the
real address. Trap bits are erased for pointer comparisons, or on casts
back to a good type. If a trap pointer is used (definitely a bug!), the
segfault handler generates a sensible warning message, decodes the
faulting instruction, "de-traps" the relevant pointer in the saved
register file, and resumes the program. (This could work a bit better
than it currently does; happy to elaborate.)
> >>Thoughts?
> >My high-level thought is the following question. Is the plan fixed on
> >developing a tool specifically for C/C++ effective-type rules, i.e.
> >focused on catching/fixing those particular cases of UB and with the
> >goal of improving user code performance? Or is there any interest in the
> >overlapping but distinct problem of helping users detect and debug
> >bad-pointer problems of a type flavour (i.e. those that are not bounds
> >errors or temporal errors)?
>
> My goal is the former, although I certainly believe there is also value in
> the latter.
Understood.
> >Plenty of pointer bugs are not UB... e.g. for any heap allocation
> >("object without a declared type", in C11-speak) I'm allowed to scribble
> >over it, thereby changing its effective type. It's only UB if I read it
> >with its previous effective type.
>
> Yes.
>
> > But the scribbling itself is fairly
> >likely to be the bug,
>
> Agreed. However, there are also legitimate use cases, and implementations of
> standard types (variant, any, etc.) may actually do this.
I agree that there is some risk of false positives here, but I think the
range of problematic code is really small and does not include the cases
you mention.
The core issue is whether type-changing writes should need to be flagged
specially in the code somehow, allowing them to be instrumented
appropriately, or whether arbitrary writes must be able to silently
change effective types. A 100% faithful adherence to standard C does
require the latter. However, I don't think C++ std::variant or std::any
are examples of code that do this, because they pretty much have to use
unions and/or placement new, which are both ways of doing the
flagging.
I do know some C code that relies on silently type-changing writes. In
fact in SPEC CPU2006 alone there are two cases -- one in bzip2, one in
lbm. The lbm case is bonkers and really should be a union. In the bzip2
case it is arguably sane; my tentative fix is to add a realloc() on the
heap array with a different type but the same size -- basically my
hacked-up C version of signalling a "placement new".
In general, relying on the "no declared type" property of heap storage
is pretty fragile, because code suddenly becomes UB if you refactor it
so that a heap object becomes a stack or static object. So asking users
to manually suppress these warnings doesn't seem unreasonable to me,
just as writing your own memcpy() would do... although no doubt opinions
differ.
(I admit I haven't tried much C++ or standard-library code yet with my
system. Also, for full disclosure: currently I do not have great support
for unions... although it errs on the side of avoiding false positives,
and I think I've figured out how it can be made to work precisely.)
> > because programmers rarely want to change the type
> >of an allocation mid-lifetime. Meanwhile, many pointer-type problems are
> >too indirect or complex for the compiler to actually do TBAA-derived
> >optimisation on, leaving little to be gained from the UB/TBAA point of
> >view... but again, plenty from the debugging perspective. I suspect the
> >biggest value of any tool, even if geared specifically towards TBAA,
> >will turn out to be largely in debugging scenarios more general than
> >effective-type UB bugs.
>
> My motivation is to help people who are currently forced to build their code
> with -fno-strict-aliasing figure out what needs to be fixed in their code so
> they don't need to do that.
Thanks for the clarification.
> This was good feedback. Your work is indeed focused on a slightly different
> problem. It could be interesting and useful to have both.
I will happily agree with this, despite the above nitpicks. :-) And (no
obligation but) I'd love to be directed towards any further
problematic/awkward code not covered by the above.
Stephen
More information about the llvm-dev
mailing list