[PATCH] D93820: [InstSimplify] Don't fold gep p, -p to null
Nuno Lopes via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat Dec 26 11:21:14 PST 2020
nlopes added a comment.
In D93820#2471805 <https://reviews.llvm.org/D93820#2471805>, @nikic wrote:
> In D93820#2471788 <https://reviews.llvm.org/D93820#2471788>, @nlopes wrote:
>
>> In D93820#2471778 <https://reviews.llvm.org/D93820#2471778>, @nikic wrote:
>>
>>> In D93820#2471737 <https://reviews.llvm.org/D93820#2471737>, @nlopes wrote:
>>>
>>>> I only looked at the tests and they were correct before, see here: https://alive2.llvm.org/ce/z/UzW3pv
>>>> The tests are weird because they have 'gep inbounds'. The reason they are correct (and weird) is that the only way `p - (int)p/sizeof(*p)` is inbounds is p being null. Anything else will overflow.
>>>
>>> This doesn't look right to me, at least not given current LangRef wording. Lets say we have `gep inbounds p, -p`, where `p = ptr(base_addr = 1, offset = -1)`. This means that the address value of `p` is `0`, but it has provenance of the object at `base_addr = 1`. As such, the `inbounds` is not violated (both `p` and the gep results are inbounds of the zero address), but we still change provenance.
>>
>> There's an extra catch: `gep inbounds` requires both the input and output pointers to be in bounds. This part is explicit in LangRef, at least.
>> Some examples:
>>
>> p = malloc()
>> q = gep inbounds p, -1 // poison
>> r = gep p, -1 // ok
>> s = gep inbounds r, 1 // poison: r is not inbounds
>> t = gep r, 1 // ok, offset = 0
>> u = gep inbounds t, 1 // ok, offset = 1 (assuming malloc size > 0)
>
> Right, but inbounds and provenance are, as far as I can tell, orthogonal concepts. Alive claims that this code has UB due to use of gep inbounds: https://alive2.llvm.org/ce/z/zTctIR At the same time, the gep inbounds itself is not poison: https://alive2.llvm.org/ce/z/wxGGyu That makes it looks like Alive also constrains provenance based on gep inbounds, not just the value of the pointer.
Your example just needs reasoning about inbounds. Let's see:
%p.int = ptrtoint i8* %p to i64
%p.neg = sub i64 0, %p.int
%p.null = getelementptr i8, i8* %p, i64 %p.neg
%p.null2 = getelementptr inbounds i8, i8* %p.null, i64 %p.null.neg
gep inbounds requires %p.null to be inbounds, otherwise the result is poison. For %p.null to be inbounds, we need the following (assuming %p has a positive offset to start with):
- %p.neg but be >= 0
- therefore %p.int <= 0
- therefore %p=null
He have established that %p.null2 is either null if %p=null, or poison otherwise. As both can be replaced with null, your example transformation is correct.
If %p was OOB, then its offset might have been negative, for example. But the reasoning is similar to what I wrote above.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D93820/new/
https://reviews.llvm.org/D93820
More information about the llvm-commits
mailing list