[llvm-dev] [RFC] Design of a TBAA sanitizer

Tue Apr 11 13:30:09 PDT 2017

On Tue, Apr 11, 2017 at 1:25 PM, Sanjoy Das <sanjoy at playingwithpointers.com>
wrote:

> Hi,
>
> On April 11, 2017 at 11:55:12 AM, Kostya Serebryany via llvm-dev
> (llvm-dev at lists.llvm.org) wrote:
> > Evgeniy and I recently discussed something similar for detecting bad
> casts
> > (code named: TypeSanitizer).
> > The approach with the shadow memory looked attractive at the first
> glance,
> > but then we've drowned in details.
> >
> > Specifically for TBAA, I had another idea, not involving shadow memory.
> > Consider LLVM queries MayAlias(P1, P2) and the result is false, i.e. we
> > think that P1 and P2 point to disjoint memory regions.
> > Then we insert a run-time check that intervals [P1,sizeof(*P1)) and
> > [P2,sizeof(*P2)) don't intersect.
> >
> > For functions with a reasonable number of pointer pairs where
> MayAlias(P1,
> > P2)==false we could insert checks for all such pairs.
>
> I'm not very clear on how this will fit into TBAA -- I don't think
> TBAA decides aliasing relation between pointers, but it decides
> aliasing relation between accesses (!tbaa metadata only exists on
> accesses).
>

of course, but accesses are done via pointers, and if TBAA queries
MayAlias(AccessViaP1, AccessViaP2)
there should (??) be a point in the IR where both P1 and P2 exist together
and can be compared.

>
> This means, at least in LLVM IR, you have to account for cases like:
>
> if (A)
>   *(int *)ptr = 20;
> if (B)
>   float f = *(float *)ptr;
>
> where ptr does not have a type (according to TBAA), but you want to
> crash the program if both A and B are true.
>
> -- Sanjoy
>
> > For larger functions -- only for those pairs where the optimizer actually
> > queried MayAlias(P1, P2).
> >
> > --kcc
> >
> >
> > On Tue, Apr 11, 2017 at 3:49 AM, Hal Finkel via llvm-dev <
> > llvm-dev at lists.llvm.org> wrote:
> >
> > >
> > > On 04/11/2017 03:46 AM, Andrey Bokhanko wrote:
> > >
> > > Hal,
> > >
> > > To clarify, my example meant to illustrate that for memory references
> to
> > > structures' fields you have to keep a user-defined type, even for one
> byte
> > > accesses. C++ allows references to "initial member sequence" using
> pointers
> > > to structures of different types. And yes, there are programs in the
> wild
> > > that rely on this (don't remember details... this was from previous
> life).
> > >
> > > Another thing to consider -- what about memset / memcpy / etc that
> > > inherently rely on type punning? If not inlined, they don't present
> > > problems for an optimizer -- probably shouldn't be considered as
> aliasing
> > > rules violation?
> > >
> > >
> > > Good point. You (likely) wouldn't want to compile your memcpy / memset
> /
> > > etc. implementations with the TBAA sanitizer enabled (or we'd need to
> > > provide an attribute to disable it for certain functions). If they
> only use
> > > char* themselves, then it's fine, but if not, that's implementation
> magic.
> > > However, memset, etc. does need to be able to 'unset' the type of
> memory
> > > and memcpy needs to be able to copy types (at least for PODs). The
> > > sanitizer would need to hook them for that purpose.
> > >
> > > -Hal
> > >
> > >
> > >
> > > Yours,
> > > Andrey
> > > ===
> > > Compiler Architect
> > > NXP
> > >
> > >
> > > On Tue, Apr 11, 2017 at 12:05 AM, Hal Finkel wrote:
> > >
> > >>
> > >> On 04/10/2017 09:55 AM, Andrey Bokhanko wrote:
> > >>
> > >> Hi Hal,
> > >>
> > >> I wonder how your solution will handle the following?
> > >>
> > >> struct {
> > >> int s1_f1;
> > >> float s1_f2;
> > >> int s1_f3;
> > >> float s1_f4;
> > >> } S1;
> > >>
> > >> struct {
> > >> int s2_f1;
> > >> float s2_f2;
> > >> int *s2_f3; // to add some interest, suppose that sizeof(int) ==
> > >> sizeof(int *)
> > >> float s2_f4;
> > >> } S2;
> > >>
> > >> S1 *s1; S2 *s2;
> > >> ...
> > >> s2 = (S1*)s1;
> > >> s2->s2_f1 = 0; // allowed
> > >> s2->s2_f2 = 0; // allowed
> > >> s2->s2_f3 = 0; // not allowed
> > >> s2->s2_f4 = 0; // not allowed
> > >>
> > >> Also, when you plan to set types for allocated memory?
> > >>
> > >>
> > >> The most-general thing seems to be to set the types along with a
> store.
> > >> As a result, the proposed scheme would not find a fault with the code
> > >> above, but would complain if anyone actually later read S1.s1_f3.
> > >>
> > >> If we want to catch these kinds of problems directly we'd need to have
> > >> the compiler insert code when the type is constructed to mark the
> types,
> > >> and then we'd need to check those types around stores. This also
> sounds
> > >> like a useful enhancement (although somewhat more complicated to
> implement).
> > >>
> > >> What types will be set for memory allocated by a malloc call?
> > >>
> > >>
> > >> Memory would be untyped (or of unknown type) when allocated.
> > >>
> > >> Thanks again,
> > >> Hal
> > >>
> > >>
> > >>
> > >> Yours,
> > >> Andrey
> > >>
> > >>
> > >> On Tue, Apr 4, 2017 at 10:13 PM, Hal Finkel via llvm-dev <
> > >> llvm-dev at lists.llvm.org> wrote:
> > >>
> > >>> Hi everyone,
> > >>>
> > >>> At EuroLLVM, Chandler and I chatted about the design for a potential
> > >>> TBAA sanitizer. Here's my attempt to summarize:
> > >>>
> > >>> C/C++ have type-based aliasing rules, and LLVM's optimizer can
> exploit
> > >>> these given TBAA metadata added by Clang. Roughly, a pointer of
> given type
> > >>> cannot be used to access an object of a different type (with, of
> course,
> > >>> certain exceptions). Unfortunately, there's a lot of code in the
> wild that
> > >>> violates these rules (e.g. for type punning), and such code often
> must be
> > >>> built with -fno-strict-aliasing. Performance is often sacrificed as a
> > >>> result. Part of the problem is the difficulty of finding TBAA
> violations. A
> > >>> sanitizer would help.
> > >>>
> > >>> A design goal of a TBAA sanitizer is to limit the shadow-memory
> overhead
> > >>> of the implementation. ASan, for example, uses 1 bit per byte. Here
> we're
> > >>> hoping to keep the overhead down to 2 bits per byte for the TBAA
> > >>> sanitizing. We might be able to do this, while handling all common
> types on
> > >>> the fast path, if we use both alignment and type information. When
> > >>> accessing data of B bytes, 2*B bits of shadow memory can be used.
> Thus,
> > >>> we'll get 2 bits for a one-byte type, 4 bits for a two-byte type,
> etc.
> > >>> Moreover, we need appropriate holes in the encoding space so that no
> type
> > >>> has a shadow encoding that overlaps with an aligned part of a larger
> type's
> > >>> encoding.
> > >>> For example, we need to detect:
> > >>>
> > >>> double f = ...; return *(int*) &f; // We should catch this.
> > >>>
> > >>> We might use the following encoding. The idea is that the common
> case,
> > >>> for which we need a reasonable fast path, is that type types are
> exactly
> > >>> equal. For this case, we want a simple comparison of the shadow type
> > >>> encodings to be sufficient to validate the access. For cases where
> the
> > >>> encodings don't match (and isn't zero to indicate an unknown type),
> or for
> > >>> which there is no direct encoding for the access type, a slow path
> must be
> > >>> used. All of this assumes that we're validating the the pointer
> alignment
> > >>> first, and then checking the type encodings.
> > >>>
> > >>> 1 Byte:
> > >>> 00 = 0 = unknown type
> > >>> 01 = 1 = hole
> > >>> 10 = 2 = hole
> > >>> 11 = 3 = all one-byte types (slow path, see note later on this)
> > >>>
> > >>> 2 Bytes:
> > >>> 0000 = 0 = unknown type
> > >>> 0101 = 5 = short
> > >>> 0110 = 6 = hole (A)
> > >>> 0111 = 7 = wchar_t (under some ABIs)
> > >>> 1001 = 9 = hole (B)
> > >>> 1010 = 10 = hole (C)
> > >>> 1011 = 11 = char16_t
> > >>> 1111 = 15 = all other types (slow path)
> > >>>
> > >>> It is important here to have wchar_t have a direct encoding here
> because
> > >>> wchar_t is two bytes on Windows, and moreover, wchar_t is very
> commonly
> > >>> used on Windows. The partial encoding overlap of wchar_t (i.e. 0111)
> with
> > >>> the 11 one-byte-type encoding works because 11 always indicates a
> slow-path
> > >>> check.
> > >>>
> > >>> 4 Bytes:
> > >>> 0000 0000 = 0 = unknown type
> > >>> A A = int
> > >>> A B = float
> > >>> B A = pointer (under some ABIs)
> > >>> B B = long (under some ABIs)
> > >>> A 1111 = wchar_t (under some ABIs)
> > >>> B 1111 = char32_t
> > >>> A C = hole (D)
> > >>> C A = hole (E)
> > >>> B C = hole (F)
> > >>> C B = hole (G)
> > >>> C C = hole (H)
> > >>> 1111 1111 = 255 = all other types (slow path)
> > >>>
> > >>> 8 Bytes:
> > >>> 0000 0000 0000 0000 = 0 = unknown type
> > >>> D D = double
> > >>> D E = long (under some ABIs)
> > >>> E D = long long (under some ABIs)
> > >>> E E = long double (under some ABIs)
> > >>> D F = pointer (under some ABIs)
> > >>> F D = hole (I)
> > >>> E F = hole (J)
> > >>> F E = hole
> > >>> F F = hole
> > >>> ...
> > >>> 1111 1111 1111 1111 = 65535 = all other types
> > >>>
> > >>> 16 Bytes:
> > >>> 0 = unknown type
> > >>> | | = __int128_t
> > >>> I J = long long (under some ABIs)
> > >>> J I = long double (under some ABIs)
> > >>> J J = hole
> > >>> ...
> > >>> -1 = all other types
> > >>>
> > >>> For pointers, this scheme would consider all pointers to be the same
> > >>> (regardless of pointee type). Doing otherwise would mostly requiring
> > >>> putting pointer-type checking on the slow path (i.e. access via a
> pointer
> > >>> pointer), and that could add considerable overhead. We might,
> however,
> > >>> split out function pointers from other pointers. We could provide a
> > >>> compile-time option to control the granularity of pointer-type
> checks.
> > >>>
> > >>> Builtin vector types for which the vector element type has a direct
> > >>> encoding also naturally have a direct encoding (the concatenation of
> the
> > >>> encoding for the element type).
> > >>>
> > >>> Obviously the fact that we have no fast-path encodings for one-byte
> > >>> types could be problematic. Note however that:
> > >>>
> > >>> 1. If a larger type is being used to access a smaller type (plus
> more),
> > >>> the encodings won't match, so we always end up on the slow path.
> > >>>
> > >>> 2. If the access type is a one-byte type, we would want to validate
> > >>> quickly. However, most common one-byte types are universally
> aliasing (i.e.
> > >>> not subject to TBAA violations). Specifically, for C/C++, pointers
> to char,
> > >>> unsigned char, signed char (C only), and std::byte, can be used to
> access
> > >>> any part of any type. That leaves signed char (C++ only),
> bool/_Bool, and
> > >>> enums with a [signed/unsigned] char base type (C++ only, std::byte
> > >>> exempted) as pointee types we might wish to validate. We'd always
> need to
> > >>> fall back to the slow path to validate these. We could provide a
> > >>> compile-time option to disable such one-byte access checking if
> necessary.
> > >>>
> > >>> How would the slow path work? First, the code needs to find the
> > >>> beginning of the allocation. It can do this by scanning backwards in
> the
> > >>> ASan shadow memory. Once located, we'll read a pointer to a
> > >>> type-description structure from that "red zone" location. For dynamic
> > >>> allocations, ASan's allocator will ensure such a space for the
> pointer
> > >>> exists. For static allocations and globals, the compiler will ensure
> it
> > >>> exists. The compiler will make sure that all constructors locate
> this field
> > >>> and fill it in. Destructors can clear it. If two of these
> > >>> type-description-structure pointers are equal, then we can conclude
> that
> > >>> the types are equal. If not, then we need to interpret the
> structure. The
> > >>> pointer itself might be to an interval map (to deal with arrays,
> placement
> > >>> new, etc. - we can use the low bit of the pointer to differentiate
> between
> > >>> an actual type-description structure and an interval map), and the
> leaves
> > >>> of the interval map point to actual type-description structures. The
> > >>> type-description structure is an array of (offset, type) pairs,
> where the
> > >>> type field is also a type-description-structure pointer. The
> > >>> type-description structures themselves are comdat globals emitted in
> each
> > >>> relevant translation unit, where the comdat key is formed using the
> mangled
> > >>> type name (and size, etc.), and pointers to these symbols are then
> used to
> > >>> identify the types.
> > >>>
> > >>> Thoughts?
> > >>>
> > >>> Thanks again,
> > >>> Hal
> > >>>
> > >>> --
> > >>> Hal Finkel
> > >>> Lead, Compiler Technology and Programming Languages
> > >>> Leadership Computing Facility
> > >>> Argonne National Laboratory
> > >>>
> > >>> _______________________________________________
> > >>> LLVM Developers mailing list
> > >>> llvm-dev at lists.llvm.org
> > >>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> > >>>
> > >>
> > >>
> > >> --
> > >> Hal Finkel
> > >> Lead, Compiler Technology and Programming Languages
> > >> Leadership Computing Facility
> > >> Argonne National Laboratory
> > >>
> > >>
> > >
> > > --
> > > Hal Finkel
> > > Lead, Compiler Technology and Programming Languages
> > > Leadership Computing Facility
> > > Argonne National Laboratory
> > >
> > >
> > > _______________________________________________
> > > LLVM Developers mailing list
> > > llvm-dev at lists.llvm.org
> > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> > >
> > >
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170411/8f8606ca/attachment.html>