[LLVMdev] Expressing ambiguous points-to info in AliasAnalysis::alias(...) results?

Daniel Berlin dberlin at dberlin.org
Sun Jun 14 22:43:46 PDT 2015


On Sun, Jun 14, 2015 at 6:35 PM, Christian Convey
<christian.convey at gmail.com> wrote:
>> > The algorithm maintains a may-point-to graph.  Unfortunately the
>> > algorithm
>> > doesn't delete an "A-->B" edge when there's a strong update of "A" but
>> > the
>> > value copied into "A" isn't a pointer.  So the interpretation of "A"
>> > having
>> > only one outbound edge (to "B") is a little ambiguous.  It means "'A'
>> > definitely points to 'B', or 'A' doesn't hold a valid pointer."
>>
>>
>> Define "valid pointer", please?
>
>
> Sorry, I can see how my phrasing raised a red flag.
>
> The original version of the algorithm I'm looking at was designed to analyze
> C source code, not LLVM IR.  I'm in the process of adapting its dataflow
> equations for IR.
>
> The algorithm assumes that a correct C program can't just compute pointer
> values ex nihilo; that they can only be obtained from certain syntactic
> structures like variable declarations, or calls to malloc, or pointer
> literals.

While true in theory, this is 100% wrong in practice ;)

>  The AA algorithm reckons that dereferencing a runtime value
> obtained by some other mechanism is so likely to be a bug, that they can
> skip worrying about it.

Given that things like "calculating vtable addresses", etc., end up
looking indistinguishable at the IR level, you can't really.
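
To illustrate the point, here is a minimal C++ sketch (not taken from any
real code base; the function name roundTrip and the constant are made up
for illustration) of a pointer that passes through plain integer
arithmetic before being dereferenced. An analysis that drops points-to
edges for "non-pointer" values would lose track of it, yet the program
is well-behaved on implementations that provide std::uintptr_t.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical example: the pointer value is laundered through an integer,
// so it does not visibly flow from a declaration, malloc, or pointer literal.
int roundTrip() {
  int target = 42;

  // The pointer becomes an ordinary integer here.
  std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(&target);

  // Arbitrary integer arithmetic; the two XORs cancel, so the original
  // bit pattern is restored.
  bits ^= 0x5A5A;
  bits ^= 0x5A5A;

  // The integer is turned back into a pointer and dereferenced.
  int *p = reinterpret_cast<int *>(bits);
  return *p;
}
```

Vtable and other computed addresses are not literally this pattern, but
they raise the same problem: the value being dereferenced was produced
by arithmetic rather than by a syntactic "pointer source".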

>
> The AA algorithm uses dataflow analysis to monitor the possible propagation
> of those values through the program code, and it represents those flows by
> updates to the may-point-to graph.  If at some code point CP, a may-point-to
> graph vertex "B" has no outbound edges, that's equivalent to saying that the
> AA has concluded the runtime memory modeled by "B" does not contain any
> pointer that a correct program has any business trying to dereference.

FWIW: When I first did GCC's current points-to analysis, I did the
same thing. It eliminated "non-pointer" values along the same lines.
This broke roughly "the entire world".

I tried to find some subset I felt was worthwhile and where it was
okay, but gave up after a while.
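
For concreteness, the may-point-to graph discussed in the quoted text
can be sketched roughly as below. This is only an illustration under
stated assumptions: the names MayPointToGraph, strongUpdatePointer,
strongUpdateNonPointer, query and Answer are all hypothetical and are
not LLVM's AliasAnalysis API, and the query policy is one conservative
choice, not the algorithm from the thread. It shows why a single
outbound edge cannot be reported as a must-alias fact, and it treats a
vertex with no edges as "unknown" rather than "no alias".

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical result type for this sketch only.
enum class Answer { NoAlias, MayAlias };

struct MayPointToGraph {
  // Edges[A] is the set of vertices that A may point to.
  std::map<std::string, std::set<std::string>> Edges;

  // Strong update with a pointer value: A's old edges are replaced by A-->B.
  void strongUpdatePointer(const std::string &A, const std::string &B) {
    Edges[A] = {B};
  }

  // Strong update with a non-pointer value: as described in the quoted
  // text, the old edges are left in place, so "A-->B" now means
  // "A points to B, or A holds no valid pointer".
  void strongUpdateNonPointer(const std::string &) {}

  // Conservative query: a shared target is only ever MayAlias, and a
  // vertex with no known edges is treated as unknown (MayAlias).
  // Disjoint non-empty target sets give NoAlias, which is only sound
  // under the assumption that the edge sets are complete.
  Answer query(const std::string &P, const std::string &Q) const {
    auto PI = Edges.find(P);
    auto QI = Edges.find(Q);
    if (PI == Edges.end() || QI == Edges.end() ||
        PI->second.empty() || QI->second.empty())
      return Answer::MayAlias;
    for (const std::string &T : PI->second)
      if (QI->second.count(T))
        return Answer::MayAlias;
    return Answer::NoAlias;
  }
};
```

With this sketch, after strongUpdatePointer("A", "B") followed by
strongUpdateNonPointer("A"), the A-->B edge is still present, which is
exactly the ambiguity under discussion.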


