[llvm-dev] RFC: Resolving TBAA issues
Hal Finkel via llvm-dev
llvm-dev at lists.llvm.org
Sun Aug 20 12:10:03 PDT 2017
On 08/20/2017 12:02 PM, Daniel Berlin wrote:
>
>
>
> I do not believe the current proposal will solve all of those
> cases, particularly when the fields are the same type and
> structures are compatible but they cannot overlap in C/C++ anyway.
>
> One of the threads is titled "[PATCH] D20665: Claim NoAlias if two
> GEPs index different fields of the same struct"
>
> For example, given
> struct {
> int arr_a[2];
> int arr_b[2];
> };
> Assume you cannot see the original allocation site.
> In LLVM IR, gep(arr_b, -1) is legally an access to arr_a[1].
> You can use -1 even though it's going to be a pointer to [2 x i32].
> Thus, you can't even tell that gep(arr_a, 0) and gep(arr_b, -1) do
> not overlap without being able to know *something* about the layout of
> fields in the structure you are talking about.
Agreed (and this certainly does motivate keeping both size and offset
information for the fields). The other thing that I think is important
here is to record whether or not this kind of inter-field indexing is
legal. In C, I believe you can always legally do this. In C++, it is
always legal for standard-layout types; otherwise, it is up to the
implementation (i.e., it depends on the types to which the
implementation allows the offsetof macro to be applied). In saying
this, I'm strengthening the wording in the standard in the following
sense: the C++ rules for pointer arithmetic and safely-derived pointer
values, at least for implementations with strict pointer safety,
disallow this kind of inter-field addressing for everything, except
perhaps in the case of two adjacent variables in standard-layout
classes. However, it's also clear that whenever you can apply the
offsetof macro, all of the relative offsets are part of the semantic
model of the abstract machine, and due to practical considerations if
nothing else, I suspect we can't reasonably restrict this behavior for
standard-layout classes.
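
To make the offset-and-size point concrete, here is a minimal C++ sketch
(an illustration only, not part of any proposal). S is a named stand-in
for the anonymous struct in the quoted example, and FieldRange, rangeEnd,
rangesOverlap, elemA and elemB are hypothetical helpers invented for this
sketch. It also assumes there is no padding between arr_a and arr_b,
which is the usual case but is not guaranteed by the standard.

```cpp
#include <cstddef>
#include <type_traits>

// Named stand-in for the anonymous struct in the quoted example.
struct S {
  int arr_a[2];
  int arr_b[2];
};

// offsetof is unconditionally supported only for standard-layout types.
static_assert(std::is_standard_layout<S>::value, "S must be standard-layout");

// Assumption for this sketch: arr_b starts right where arr_a ends.
static_assert(offsetof(S, arr_b) == offsetof(S, arr_a) + sizeof(int[2]),
              "sketch assumes no padding between the fields");

// Hypothetical record of an access: byte offset from the start of the
// enclosing object, plus the access size in bytes.
struct FieldRange {
  std::ptrdiff_t offset;
  std::size_t size;
};

constexpr std::ptrdiff_t rangeEnd(FieldRange r) {
  return r.offset + static_cast<std::ptrdiff_t>(r.size);
}

// Two accesses off the same base can overlap only if their byte ranges do.
constexpr bool rangesOverlap(FieldRange a, FieldRange b) {
  return a.offset < rangeEnd(b) && b.offset < rangeEnd(a);
}

// Byte range of arr_a[i], computed from the field offset.
constexpr FieldRange elemA(std::ptrdiff_t i) {
  return FieldRange{static_cast<std::ptrdiff_t>(offsetof(S, arr_a)) +
                        i * static_cast<std::ptrdiff_t>(sizeof(int)),
                    sizeof(int)};
}

// Byte range of "arr_b[i]"; i may be negative, as in gep(arr_b, -1).
constexpr FieldRange elemB(std::ptrdiff_t i) {
  return FieldRange{static_cast<std::ptrdiff_t>(offsetof(S, arr_b)) +
                        i * static_cast<std::ptrdiff_t>(sizeof(int)),
                    sizeof(int)};
}

// gep(arr_b, -1) lands on arr_a[1], but not on arr_a[0].
static_assert(rangesOverlap(elemA(1), elemB(-1)), "same bytes as arr_a[1]");
static_assert(!rangesOverlap(elemA(0), elemB(-1)), "disjoint from arr_a[0]");
```

The sketch only compares byte ranges derived from offsetof; it never
forms the out-of-bounds pointer itself, so it does not depend on how the
pointer-arithmetic question above is settled.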
Thanks again,
Hal
>
> I'd start with: It should not require TBAA to determine that loads
> from GEPs of arr_a and arr_b cannot overlap. That is true regardless
> of the types involved.
>
> In terms of "who cares", Google definitely compiles with
> -fno-strict-aliasing (because third-party packages are still not clean
> enough), and last I looked, Apple did the same (but I admittedly have
> not kept up).
>
> GCC can definitely disambiguate field accesses (through points-to and
> otherwise) better than LLVM in a situation where strict aliasing is off.
>
> As an aside, I also can't build a sane field-sensitive points-to
> analysis on our current type system, because the types and structures
> are already meaningless (and we are busy making it weaker, too).
> I don't think we are going to want to tie field-sensitive points-to to
> TBAA (you definitely want to be able to run the former without the
> latter), but right now that is the only metadata you can use.
>
> Finally, the merging of TBAA is definitely going to be more
> conservative than the merging of field offset info: If we merge a load
> of an int and a float, we will, IIRC, go to the nearest common
> ancestor in TBAA. The field offset info may actually still be
> identical between the two, but we will lose it by creating, or going
> to, the common ancestor.
>
>
>
>
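
The merging concern in the last quoted paragraph can be sketched as
follows. This is a simplified model, not LLVM's actual metadata code:
TypeNode, AccessInfo, nearestCommonAncestor, mergeTiedToType and
sameField are hypothetical names, and the tree in the usage note is only
loosely modelled on a scalar TBAA hierarchy.

```cpp
#include <cstddef>
#include <string>

// Hypothetical node in a tree of types; the root has a null parent.
struct TypeNode {
  std::string name;
  const TypeNode* parent;
};

// Hypothetical access description: a type node plus field offset and size.
struct AccessInfo {
  const TypeNode* type;
  std::size_t offset;
  std::size_t size;
};

// Number of parent links between a node and its root.
std::size_t depth(const TypeNode* n) {
  std::size_t d = 0;
  while (n->parent) {
    n = n->parent;
    ++d;
  }
  return d;
}

// Walk both nodes up to the same depth, then up together until they meet.
// Returns nullptr if the nodes live in different trees.
const TypeNode* nearestCommonAncestor(const TypeNode* a, const TypeNode* b) {
  std::size_t da = depth(a), db = depth(b);
  while (da > db) { a = a->parent; --da; }
  while (db > da) { b = b->parent; --db; }
  while (a != b) { a = a->parent; b = b->parent; }
  return a;
}

// If field info is tied to the type node, merging two accesses of different
// types falls back to the common ancestor and drops the field info (here:
// offset 0, size 0 meaning "unknown"), even when it was identical.
AccessInfo mergeTiedToType(const AccessInfo& x, const AccessInfo& y) {
  if (x.type == y.type && x.offset == y.offset && x.size == y.size)
    return x;
  return AccessInfo{nearestCommonAncestor(x.type, y.type), 0, 0};
}

// Field info kept separately could be compared without looking at types.
bool sameField(const AccessInfo& x, const AccessInfo& y) {
  return x.offset == y.offset && x.size == y.size;
}
```

For example, with a root, a "char" child, and "int" and "float" nodes
under "char", merging an int access and a float access at the same
offset and size yields the "char" node with the field info gone, while
sameField on the two original accesses still returns true.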
--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory