[PATCH] D51203: AMDGPU: Handle 32-bit address wraparounds for SMRD opcodes

Wed Aug 29 06:39:22 PDT 2018

nhaehnle accepted this revision.
nhaehnle added a comment.
This revision is now accepted and ready to land.

In https://reviews.llvm.org/D51203#1216951, @mareko wrote:

> In https://reviews.llvm.org/D51203#1214382, @nhaehnle wrote:
>
> > Looking at SelectionDAGBuilder::visitGetElementPtr, nuw is set under certain conditions for `inbounds` getelementptr. I suspect we should be able to make most GEPs inbounds in Mesa - it just means that we never, not even temporarily, try to take addresses outside of properly allocated memory objects (buffers, arrays of descriptors).
> >
> > Would the combination of:
> >
> > - check NUW here
> > - create inbounds GEP make good use of SMEM/SMRD immediates?
>
>
> The code in SelectionDAGBuilder::visitGetElementPtr is unsafe for our case, because it doesn't know that the addition of 32-bit addresses is performed in 64 bits. Even x+4 can overflow in 32 bits but not 64 bits. The GEP can be "inbounds", but it doesn't change anything. However, we can hackishly use the inbounds flag to mean that the addition is safe, because GEP with inbounds and offset <= INT_MAX is converted to "add nuw".

Yes, x+4 can overflow, but 'inbounds' is basically a promise that that won't happen. I'm pretty sure there's nothing hackish or unsafe about it.

First, 'inbounds' on getelementptr means (according to LangRef): interpreting all array offsets as signed integers and performing all arithmetic with infinite precision (i.e., no wraparound), the resulting pointer stays in bounds of the object pointed to by the base pointer at each step.

Now, visitGetElementPtr translates inbounds getelementptr on addrspace(6) with a non-negative array index into an `add nuw i32`. This is correct, because memory objects themselves never wrap around (there is no memory object that starts below 2^32 and ends above 0) and so the in bounds promise means that there is indeed no unsigned wrapping when adding as 32-bits.

Second, when we encounter (load (add nuw i32 reg, const)), and we put the const into the SMRD, then yes, the addition will be performed in 64 bits. But that doesn't change the wrapping behavior because there was no wrapping to begin with.

I think this change is good to go as-is.

Repository:
  rL LLVM

https://reviews.llvm.org/D51203