[PATCH] D90708: [LangRef] Clarify GEP inbounds wrapping semantics

Nikita Popov via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 12 09:46:52 PST 2020


nikic marked an inline comment as done.
nikic added inline comments.


================
Comment at: llvm/docs/LangRef.rst:9782
+   means that it points into an allocated object, or to its end (which is one
+   byte past the last byte contained in the object). The only *in bounds*
+   address for a null pointer in the default address-space is the null pointer
----------------
jrtc27 wrote:
> nlopes wrote:
> > jrtc27 wrote:
> > > nikic wrote:
> > > > nlopes wrote:
> > > > > I still don't like the current writing. I would need to see some evidence from language standards that they require pointers past the end of objects.
> > > > What would be a better wording? "One past the end" is a term of art, and as such should be well understood: https://www.google.com/search?q=one+past+the+end
> > > > If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.
> > > 
> > > https://port70.net/~nsz/c/c11/n1570.html#6.5.6p8
> > Thanks for the reference. Though that paragraph doesn't say that a pointer 1 byte past the end is valid.
> > It says that the following is valid:
> > int x[n]
> > q = p+(n-1); // points to the last element
> > q = p+1; // points to one element past the last
> > 
> > Doesn't say that `(char*)(p+n)+1` is valid, which is what it means for a pointer 1 byte past the end to be valid.
> > 
> > So AFAICT, both the C & C++ standards agree that p+n is the max one needs to support.
> > 
> > My suggestion is simply to remove the part in parentheses, "(which is one byte past the last byte contained in the object)", or to replace it with wording similar to the C++ standard's (a pointer corresponding to a hypothetical next element, or something like that).
> Assuming
> > ```q = p+1; /* points to one element past the last */```
> was meant to be
> > ```q = p+n; /* points to one element past the last */```
> 
> 
> It depends whether you define end as being `p + n` or `(char *)(p + n) - 1`. C/C++ use the latter (as do people when they talk about "one past the end" pointers), whereas you seem to be using the former. To C/C++, `(char*)(p+n)+1` would be OOB as it's one byte after one past the last element.
> 
> So I think we are on the same page in terms of semantics, we just have different ideas of what certain terms mean.
Okay, I've dropped the part in the parentheses.
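
For reference, the distinction discussed above can be sketched in C roughly as follows. This is only an illustration under the C11 6.5.6p8 reading quoted earlier; the helper names `one_past_end_offset` and `sum_ints` are hypothetical and appear in neither the patch nor LangRef.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte distance from the start of an array of n ints
   to its one-past-the-end pointer, p + n. */
size_t one_past_end_offset(const int *p, size_t n) {
    assert(p != NULL);
    const int *end = p + n; /* valid to form and compare, not to dereference */
    /* (const char *)end + 1 would be one byte beyond "one past the end";
       forming it is undefined behavior in C, so it is deliberately not done. */
    return (size_t)((const char *)end - (const char *)p);
}

/* Hypothetical helper: the usual idiom that relies on p + n being a valid
   sentinel for iteration. */
int sum_ints(const int *p, size_t n) {
    assert(p != NULL);
    int total = 0;
    for (const int *q = p; q != p + n; ++q)
        total += *q;
    return total;
}
```

Here the "end" pointer is exactly `n * sizeof(int)` bytes past the start of the object, i.e. one byte past its last byte, which is the reading of the parenthetical that was under discussion.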


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D90708/new/

https://reviews.llvm.org/D90708


