[LLVMdev] Expressing ambiguous points-to info in AliasAnalysis::alias(...) results?
Daniel Berlin
dberlin at dberlin.org
Sun Jun 14 22:43:46 PDT 2015
On Sun, Jun 14, 2015 at 6:35 PM, Christian Convey
<christian.convey at gmail.com> wrote:
>> > The algorithm maintains a may-point-to graph. Unfortunately the algorithm
>> > doesn't delete an "A-->B" edge when there's a strong update of "A" but the
>> > value copied into "A" isn't a pointer. So the interpretation of "A" having
>> > only one outbound edge (to "B") is a little ambiguous. It means "'A'
>> > definitely points to 'B', or 'A' doesn't hold a valid pointer."
>>
>>
>> Define "valid pointer", please?
>
>
> Sorry, I can see how my phrasing raised a red flag.
>
> The original version of the algorithm I'm looking at was designed to analyze
> C source code, not LLVM IR. I'm in the process of adapting its dataflow
> equations for IR.
>
> The algorithm assumes that a correct C program can't just compute pointer
> values ex nihilo; that they can only be obtained from certain syntactic
> structures like variable declarations, or calls to malloc, or pointer
> literals.

While true in theory, this is 100% wrong in practice ;)

> The AA algorithm reckons that dereferencing a runtime value
> obtained by some other mechanism is so likely to be a bug, that they can
> skip worrying about it.

Given that things like "calculating vtable addresses", etc., end up
looking indistinguishable at the IR level, you can't really.
>
> The AA algorithm uses dataflow analysis to monitor the possible propagation
> of those values through the program code, and it represents those flows by
> updates to the may-point-to graph. If at some code point CP, a may-point-to
> graph vertex "B" has no outbound edges, that's equivalent to saying that the
> AA has concluded the runtime memory modeled by "B" does not contain any
> pointer that a correct program has any business trying to dereference.
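
The graph semantics described in the quoted text above can be sketched roughly as follows. This is only a toy model of the description in this thread, not the actual algorithm or any LLVM API; the names MayPointToGraph, strong_update and interpret are made up for illustration.

```python
from collections import defaultdict


class MayPointToGraph:
    """Toy may-point-to graph: vertex -> set of vertices it may point to."""

    def __init__(self):
        self.edges = defaultdict(set)

    def strong_update(self, dst, targets, value_is_pointer):
        # Assumption taken from the description in this thread: a strong
        # update replaces dst's edges only when the stored value is a
        # pointer; a non-pointer store leaves the old edges in place.
        if value_is_pointer:
            self.edges[dst] = set(targets)

    def interpret(self, vertex):
        # Read a vertex's outbound edges the way the thread describes.
        targets = self.edges.get(vertex, set())
        if not targets:
            return "holds no pointer a correct program may dereference"
        if len(targets) == 1:
            (only,) = targets
            return f"points to {only}, or holds no valid pointer"
        return "may point to any of " + ", ".join(sorted(targets))


g = MayPointToGraph()
g.strong_update("A", {"B"}, value_is_pointer=True)   # A = &B
g.strong_update("A", set(), value_is_pointer=False)  # A = 42; the A-->B edge stays
print(g.interpret("A"))
print(g.interpret("B"))
```

The second store is what makes a single outbound edge ambiguous: the edge survives even though "A" no longer holds a pointer.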

FWIW: When I first did GCC's current points-to analysis, I did the
same thing. It eliminated "non-pointer" values along the same lines.
This broke roughly "the entire world".

I tried to find some subset I felt was worthwhile and where it was
okay, but gave up after a while.
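
As a rough illustration of how dropping "non-pointer" values can go wrong, here is a minimal sketch of a flow-insensitive points-to solver over a made-up instruction list. It is not GCC's or LLVM's implementation; the instruction names, the solve function and the track_integers flag are all hypothetical.

```python
def solve(instructions, track_integers):
    """Tiny flow-insensitive points-to solver over a made-up instruction list."""
    pts = {}
    changed = True
    while changed:
        changed = False
        for op, dst, src in instructions:
            if op == "addr":                  # dst = &src
                new = {src}
            elif op in ("copy", "inttoptr"):  # dst takes over whatever src may point to
                new = pts.get(src, set())
            elif op == "ptrtoint":            # dst is an integer derived from a pointer
                # With track_integers=False the integer is treated as a
                # "non-pointer" value and its points-to set is dropped.
                new = pts.get(src, set()) if track_integers else set()
            else:
                raise ValueError(f"unknown op: {op}")
            old = pts.setdefault(dst, set())
            if not new <= old:
                old |= new
                changed = True
    return pts


program = [
    ("addr", "p", "x"),      # p = &x
    ("ptrtoint", "i", "p"),  # i = (integer) p
    ("inttoptr", "q", "i"),  # q = (pointer) i
]

sound = solve(program, track_integers=True)
filtered = solve(program, track_integers=False)
print(sound["q"])     # q may point to x
print(filtered["q"])  # empty set: the fact that q aliases p is lost
```

With the filter on, the analysis concludes that "q" points to nothing, even though it holds the same address as "p".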