[llvm-dev] RFC: Resolving TBAA issues
Ivan A. Kosarev via llvm-dev
llvm-dev at lists.llvm.org
Wed Aug 23 07:26:15 PDT 2017
Daniel,
> GEP has no relation to original field accesses, as you know (IE
> we allow them to access negative offsets, etc)
> For a lot of these languages, more than the TBAA rules say that
> you can't just go marching through structures, etc.
So with the current approach we mix two different things: alias rules
for types and information about specific accesses, such as offsets. This
means that whatever we conclude from a pair of accesses described with
such a mix can never extend beyond the scope of what Clang treats as a
single access, that is, an expression of the form 'p->a.b.c'. The same
expression split into parts, e.g., 'p2 = &p->a.b; p2->c', results in a
less specific description of the access and, as a consequence, in a
greater number of potential false positives. In turn, proving that 'p2'
relates to 'p' is up to analyses that deal with memory locations, not
memory accesses. It looks like the current approach leads us nowhere in
the long term.
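
To make the loss of precision concrete, here is a toy model in Python. It
is only a sketch and not LLVM's TBAA algorithm: the struct layouts, the
Access descriptor, and the helpers placements() and may_alias() are all
hypothetical names invented for illustration. It assumes every access is
a 4-byte int at a 4-byte-aligned offset, and that both accesses fall
inside one object of the outermost type.

```python
from collections import namedtuple

# Hypothetical toy layouts: type name -> [(field offset, field type)].
# "int" is a 4-byte scalar with no fields of its own.
LAYOUTS = {
    "B": [(0, "int"), (4, "int")],  # struct B { int c; int d; };
    "A": [(0, "int"), (4, "B")],    # struct A { int x; struct B b; };
    "S": [(0, "A"), (12, "B")],     # struct S { struct A a; struct B other; };
}

# An access descriptor: the type the access expression starts from,
# plus the byte offset of the accessed int within that type.
Access = namedtuple("Access", ["base", "offset"])


def placements(outer, base, start=0):
    """Return every byte offset at which a `base` subobject sits inside `outer`."""
    if outer == base:
        return [start]
    found = []
    for offset, field_type in LAYOUTS.get(outer, []):
        found += placements(field_type, base, start + offset)
    return found


def may_alias(a, b, outer="S"):
    """Conservative answer: True if the accesses could hit the same int in `outer`."""
    locs_a = {p + a.offset for p in placements(outer, a.base)}
    locs_b = {p + b.offset for p in placements(outer, b.base)}
    # All accesses are aligned 4-byte ints, so overlap means equal offsets.
    return bool(locs_a & locs_b)


whole = Access("S", 4)   # p->a.b.c written as one expression
other = Access("S", 12)  # q->other.c written as one expression
split = Access("B", 0)   # p2->c after p2 = &p->a.b: only "c of some B" is known

print(may_alias(whole, other))  # False
print(may_alias(split, other))  # True
```

The descriptor for the split form only says "field c of some B", so in
this model it has to be treated as possibly overlapping 'other.c', even
though the whole-expression form rules that out.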
If I understand correctly, stripping offsets out of TBAA information
means we end up with a sort of alias sets. Offsets then go to another
metadata tag that encodes accesses in terms of constraint expressions.
These tags are supposed to be processed by what should eventually become
an implementation of a field-sensitive points-to analysis. This would
also resolve the issue of BasicAA vs. TBAA responses.
I wonder whether !tbaa tags for loads and stores, reworked to refer to
both alias sets and constraint expressions, would work as a transitional
format while we work our way toward a full field-sensitive analysis.
Thanks,
--