[PATCH] D114988: [IR] `GetElementPtrInst`: per-index `inrange` support

Thu Dec 2 14:40:52 PST 2021

lebedev.ri added a comment.

In D114988#3168197 <https://reviews.llvm.org/D114988#3168197>, @nikic wrote:

>> I think you are missing the whole point there. It is explicitly NOT the point of this patch
>> to be able to encode that some index must take values in range of [x, y). So if that is your proposal,
>> while it may be interesting, it's explicitly inferior, and does not solve the motivational case.
>
> Right, sorry. I thought you wanted to solve the same issue discussed in the llvm-dev thread, but I see that you're more interested in SROA than AA here.

Yup.

> I don't think this materially changes my point though. Rather than restricting the range of an offset,
> you are instead restricting which offsets of the GEP base can be accessed (either through this GEP,
> or a later one). That's still something that can be expressed in terms of explicit offsets rather than
> basing it on struct types.

I agree that we can express a number of cases with explicit offsets.
Presumably, we'd even be fine with having a number of these `inrange(C0, C1)` in a single GEP.
But, what happens for e.g.

  struct S {
    int array[4][4];
  };
  int& foo(S *s, int i, int j) {
    return s->array[i][j];
  }

?
We can annotate that we don't escape from the entire `array` (i.e. `[0, 64)` in bytes, assuming a 4-byte `int`),
but if `i` is not a constant, how are we going to annotate the inner GEP?
Should we require that there be two GEPs there, and forbid their fusion?
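
To make the question concrete, here is a minimal sketch (not part of the patch) of the byte ranges involved, assuming `S` is standard-layout. The helper names `wholeArrayRange`, `rowRange` and `elementOffset` are hypothetical and exist only for illustration. The range of the whole `array` is a compile-time constant, whereas the range of row `i` is only known once `i` is:

```cpp
#include <cassert>
#include <cstddef>

struct S {
  int array[4][4];
};

// Half-open byte range [begin, end), relative to the start of an S object.
struct ByteRange {
  std::size_t begin;
  std::size_t end;
};

// Range covered by the whole `array` member: a constant, independent of i and j.
inline ByteRange wholeArrayRange() {
  return {offsetof(S, array), offsetof(S, array) + sizeof(S::array)};
}

// Range covered by row `i`: depends on the runtime value of `i`, so it cannot
// be spelled as a pair of constants when `i` is variable.
inline ByteRange rowRange(std::size_t i) {
  assert(i < 4 && "row index must stay inside the array");
  std::size_t begin = offsetof(S, array) + i * sizeof(int[4]);
  return {begin, begin + sizeof(int[4])};
}

// Byte offset that a single fused GEP for s->array[i][j] would compute.
inline std::size_t elementOffset(std::size_t i, std::size_t j) {
  assert(i < 4 && j < 4 && "indices must stay inside the array");
  return offsetof(S, array) + i * sizeof(int[4]) + j * sizeof(int);
}
```

Here `wholeArrayRange()` spans `16 * sizeof(int)` bytes regardless of the indices, while `rowRange(i)` moves with `i`; that moving window is the part a constant `inrange(C0, C1)` annotation on a fused GEP could not describe.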

> One could argue that this is really a "restrict object" operation
> that is largely independent from the GEP arithmetic and could e.g. be represented as
> an intrinsic returning a restricted pointer. But I understand that encoding this in the GEP makes this more viable.

>> Concretely, can you quote anything that says that in the opaque pointer future,
>> the only GEP that will remain will only be able to apply a byte offset to the pointer,
>> i.e. there won't be GEPs into structs/multiple indices?
>
> See this thread: https://groups.google.com/g/llvm-dev/c/U7D6z7ZnKy8/m/2-xy5zPcBAAJ
> Under opaque pointers,
> we are currently representing constant offset GEPs as i8 GEPs, because this makes things
> a lot simpler -- with opaque pointers type information tends to not be preserved naturally,
> and you have to go out of your way, and in some cases use heuristics, to preserve type information.
> And apart from `inrange`, doing so is entirely wasted effort.
> For example, SROA will just generate i8 GEPs instead of doing a complicated dance trying to find a minimal natural-looking GEP.
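
For reference, a rough C++ analogy of what "constant offset GEP as i8 GEP" means (a sketch only; `typedAccess`, `byteAccess` and `kByteOffset` are hypothetical names, and `S` is the struct from the example above): a typed access with constant indices and a plain byte offset from the base pointer yield the same address, but the byte form carries no struct or array type information.

```cpp
#include <cstddef>

struct S {
  int array[4][4];
};

// Constant byte offset of s->array[1][2] from the start of S.
constexpr std::size_t kByteOffset =
    offsetof(S, array) + 1 * sizeof(int[4]) + 2 * sizeof(int);

// "Typed" form: walks the struct member and both array dimensions.
inline int *typedAccess(S *s) {
  return &s->array[1][2];
}

// "i8" form: a single byte offset applied to the base pointer; nothing in
// this expression says which member or which row is being addressed.
inline int *byteAccess(S *s) {
  char *base = reinterpret_cast<char *>(s);
  return reinterpret_cast<int *>(base + kByteOffset);
}
```

Both functions return the same pointer for any `S` object, which is why any `inrange`-style information would have to be preserved deliberately rather than falling out of the index structure.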

Note that I'm //mostly// interested in annotating the outermost variable GEP,
since that is what will limit the alloca splitting. Constant indices are obvious,
and any indices after the first variable index won't be relevant.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114988/new/

https://reviews.llvm.org/D114988