[PATCH] D114988: [IR] `GetElementPtrInst`: per-index `inrange` support
Roman Lebedev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Dec 2 13:17:41 PST 2021
lebedev.ri requested review of this revision.
lebedev.ri added a comment.
In D114988#3167963 <https://reviews.llvm.org/D114988#3167963>, @nikic wrote:
> While I support the general goal of exposing GEP offset restrictions to IR,
> I am quite strongly opposed to the implementation approach of extending `inrange`.
> The core issue is that this is strongly tied to LLVM struct types and structural GEP indexing.
> This will be a blow to opaque pointer usefulness and future offset canonicalization for GEPs.
While I'm certainly sympathetic to the opaque pointer future,
I'd also like to point out that opaque pointers are just a tool.
Concretely, can you quote anything that says that, in the opaque pointer future,
the only remaining form of GEP will be one that applies a byte offset to the pointer,
i.e. that there won't be GEPs into structs or with multiple indices?
> I think the correct approach to `inrange`-like information is to restrict the range of GEP indices
> without relying on the underlying structure type. Think `inrange(0, 4) i32 %x` for `%x` between 0 and 4.
> This naturally integrates in purely offset-based alias analysis, and can be more generally preserved under transformation.
> For example, if you have `gep %base, inrange (%x + 1)`, if this is transformed into `gep (gep %base, 1), %x`
> there is no way to preserve `inrange` information under the current proposal,
> while a proper offset-based approach could easily retain information under simple transformations.
I think you are missing the whole point there. It is explicitly **NOT** the point of this patch
to be able to encode that some index must take values in the range [x, y). So if that is your proposal,
while it may be interesting, it is explicitly inferior and does not solve the motivating case.
The reason is that it encodes *VERY* different semantics.
If we've encoded that in
  struct S {
    int a[3];
    int b[3];
    int c[3];
  };
  void bar(int*);
  void foo(S* s, int i) {
    int* p = &s->b[i];
    bar(p);
    int* p2 = p + 4; // UB!
    bar(p2);
  }
the variable `i` of function `foo` must be in `[0, 3)`,
that only tells us that `p` points somewhere in `(int*)s + 3 + [0, 3)`.
What it does not encode is that, given that pointer, we cannot go outside of that array,
i.e. that `int* p2 = p + 4;` is UB.
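
To make the difference concrete, here is a minimal sketch that models the two kinds of facts as plain offset arithmetic, counted in `int` units from the start of `S`. The helper names (`kBBegin`, `kBEnd`, `allowed_by_index_range`, `allowed_by_subobject_bounds`) are hypothetical and exist only for illustration; this is not how LLVM represents either fact.

```cpp
#include <cstddef>

struct S {
  int a[3];
  int b[3];
  int c[3];
};

// Offsets in units of int from the start of S (hypothetical helpers).
constexpr std::ptrdiff_t kBBegin = offsetof(S, b) / sizeof(int);
constexpr std::ptrdiff_t kBEnd = kBBegin + 3;  // one past the end of b

// Fact 1: "the index i is in [0, 3)". It constrains where p = &s->b[i]
// points, but says nothing about later arithmetic on p, so delta is ignored.
constexpr bool allowed_by_index_range(std::ptrdiff_t i, std::ptrdiff_t /*delta*/) {
  return 0 <= i && i < 3;
}

// Fact 2: "pointers derived from p must stay within b". The derived
// pointer p + delta has to land in [b, b + 3], one-past-the-end included.
constexpr bool allowed_by_subobject_bounds(std::ptrdiff_t i, std::ptrdiff_t delta) {
  return kBBegin <= kBBegin + i + delta && kBBegin + i + delta <= kBEnd;
}
```

With `i == 0` and `delta == 4` (the `p + 4` case above), the first predicate still holds, while the second one rejects the access; only the second kind of fact captures that `p2` is out of bounds for `b`.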
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D114988/new/
https://reviews.llvm.org/D114988