[PATCH] D32199: [TBAASan] A TBAA Sanitizer (Clang)
Hal Finkel via cfe-commits
cfe-commits at lists.llvm.org
Mon May 1 12:16:19 PDT 2017
On 05/01/2017 12:49 PM, Daniel Berlin wrote:
>
>
> On Fri, Apr 21, 2017 at 4:03 AM, Hal Finkel via Phabricator
> <reviews at reviews.llvm.org <mailto:reviews at reviews.llvm.org>> wrote:
>
> hfinkel added a comment.
>
> In https://reviews.llvm.org/D32199#732737
> <https://reviews.llvm.org/D32199#732737>, @rsmith wrote:
>
> > In https://reviews.llvm.org/D32199#732189
> <https://reviews.llvm.org/D32199#732189>, @hfinkel wrote:
> >
> > > In https://reviews.llvm.org/D32199#731472
> <https://reviews.llvm.org/D32199#731472>, @rsmith wrote:
> > >
> > > > 1. C's "effective type" rule allows writes to set the type
> pretty much unconditionally, unless the storage is for a variable
> with a declared type
> > >
> > >
> > > To come back to this point: We don't really implement these
> rules now, and it is not clear that we will. The problem here is
> that, if we take the specification literally, then we can't use
> our current TBAA at all. The problem is that if we have:
> > >
> > > write x, !tbaa "int"
> > > read x, !tbaa "int"
> > > write x, !tbaa "float"
> > >
> > >
> > > TBAA will currently tell us that the "float" write aliases
> with neither the preceding read nor the preceding write.
> >
> >
> > Right, C's TBAA rules do not (in general) permit a store to be
> reordered before a memory operation of a different type, they only
> allow loads to be moved before stores. (Put another way, they do
> not tell you that pointers point to distinct memory locations,
> just that a stored value cannot be observed by a load of a
> different type.) You get the more general "distinct memory
> locations" result only for objects of a declared type.
> >
> > C++ is similar, except that (because object lifetimes do not
> currently begin magically due to a store) you /can/ reorder stores
> past a memory operation of a different type if you know no
> object's lifetime began in between. (But currently we do not
> record all lifetime events in IR, so we can't do that today. Also,
> we may be about to lose the property that you can statically
> determine a small number of places that might start an object
> lifetime.)
> >
> > > Also, a strict reading of C's access rules seems to rule out
> the premise underlying our struct-path TBAA entirely. So long as
> I'm accessing a value using a struct that has some member,
> including recursively, with that type, then it's fine. The
> matching of the relative offsets is a sufficient, but not
> necessary, condition for well-defined access. C++ has essentially
> the same language (and, thus, potentially the same problem).
> >
> > I agree this rule is garbage, but it's not as permissive as I
> think you're suggesting. The rule says that you can use an lvalue
> of struct type to access memory of struct field type. In C this
> happens during struct assignment, for instance. It does *not*
> permit using an lvalue of struct field type to access unrelated
> fields of the same struct. So C appears to allow this nonsense:
> >
> > char *p = malloc(8);
> > *(int*)p = 0;
> > *(int*)(p + 4) = 0;
> > struct S {int n; float f;} s = *(struct S*)p; // use lvalue of
> type `struct S` to access object of effective type `int`, to
> initialize a `float`
> >
> >
> > but not this nonsense:
> >
> > float q = ((struct S*)p)->f; // ub, cannot use lvalue of type
> `float` to access object of effective type `int`
> >
> >
> > ... which just means that we can't make much use of TBAA when
> emitting struct copies in C.
> >
> > In C++, on the other hand, the rule is even more garbage, since
> there is no way to perform a memory access with a glvalue of class
> type. (The closest you get is that a defaulted union
> construction/assignment copies the object representation, but
> that's expressed in terms of copying a sequence of unsigned chars,
> and in any case those are member functions and so already require
> an object of the correct type to exist.) See wg21.link/cwg2051
>
>
> Our struct-path TBAA does the following:
>
> struct X { int a, b; };
> X x { 50, 100 };
> X *o = (X*) (((int*) &x) + 1);
>
> int a_is_b = o->a; // This is UB (or so we say)?
>
>
> This is UB.
> A good resource for this stuff is
> http://www.cl.cam.ac.uk/~pes20/cerberus/
> <http://www.cl.cam.ac.uk/%7Epes20/cerberus/> which has a long document
> where they exlpore all of these and what various compilers do, along
> with what the standard seems to say.
http://www.cl.cam.ac.uk/~pes20/cerberus/notes30-full.pdf is 172 pages,
and so I may have missed it, but I don't see this case. Also, I'd really
like to see where the standard says this is UB. I don't see it.
Thanks again,
Hal
--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20170501/24733a08/attachment-0001.html>
More information about the cfe-commits
mailing list