[PATCH] D93820: [InstSimplify] Don't fold gep p, -p to null

Nuno Lopes via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Dec 26 11:21:14 PST 2020


nlopes added a comment.

In D93820#2471805 <https://reviews.llvm.org/D93820#2471805>, @nikic wrote:

> In D93820#2471788 <https://reviews.llvm.org/D93820#2471788>, @nlopes wrote:
>
>> In D93820#2471778 <https://reviews.llvm.org/D93820#2471778>, @nikic wrote:
>>
>>> In D93820#2471737 <https://reviews.llvm.org/D93820#2471737>, @nlopes wrote:
>>>
>>>> I only looked at the tests and they were correct before, see here: https://alive2.llvm.org/ce/z/UzW3pv
>>>> The tests are weird because they have 'gep inbounds'. The reason they are correct (and weird) is that the only way `p - (int)p/sizeof(*p)` is inbounds is p being null. Anything else will overflow.
>>>
>>> This doesn't look right to me, at least not given current LangRef wording. Lets say we have `gep inbounds p, -p`, where `p = ptr(base_addr = 1, offset = -1)`. This means that the address value of `p` is `0`, but it has provenance of the object at `base_addr = 1`. As such, the `inbounds` is not violated (both `p` and the gep results are inbounds of the zero address), but we still change provenance.
>>
>> There's an extra catch: `gep inbounds` requires both the input and output pointers to be in bounds. This part is explicit in LangRef, at least.
>> Some examples:
>>
>>   p = malloc()
>>   q = gep inbounds p, -1  // poison
>>   r = gep p, -1           // ok
>>   s = gep inbounds r, 1   // poison: r is not inbounds
>>   t = gep r, 1            // ok, offset = 0
>>   u = gep inbounds t, 1   // ok, offset = 1 (assuming malloc size > 0)
>
> Right, but inbounds and provenance are, as far as I can tell, orthogonal concepts. Alive claims that this code has UB due to use of gep inbounds: https://alive2.llvm.org/ce/z/zTctIR At the same time, the gep inbounds itself is not poison: https://alive2.llvm.org/ce/z/wxGGyu That makes it looks like Alive also constrains provenance based on gep inbounds, not just the value of the pointer.

Your example just needs reasoning about inbounds. Let's see:

  %p.int = ptrtoint i8* %p to i64
  %p.neg = sub i64 0, %p.int
  %p.null = getelementptr i8, i8* %p, i64 %p.neg
  %p.null2 = getelementptr inbounds i8, i8* %p.null, i64 %p.null.neg

gep inbounds requires %p.null to be inbounds, otherwise the result is poison. For %p.null to be inbounds, we need the following (assuming %p has a positive offset to start with):

- %p.neg but be >= 0
- therefore %p.int <= 0
- therefore %p=null

He have established that %p.null2 is either null if %p=null, or poison otherwise. As both can be replaced with null, your example transformation is correct.

If %p was OOB, then its offset might have been negative, for example. But the reasoning is similar to what I wrote above.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93820/new/

https://reviews.llvm.org/D93820



More information about the llvm-commits mailing list