[llvm-dev] [InstCombine] Addrspacecast and GEP assumed commutative

David Chisnall via llvm-dev llvm-dev at lists.llvm.org
Mon Mar 23 10:50:22 PDT 2020


Hi,

I missed this thread initially, but we hit this issue in CHERI a year or 
so ago and have disabled the relevant bit of InstCombine in our 
downstream fork.  In CHERI, AS200 is used to represent memory 
capabilities (hardware-enforced fat pointers that do bounds and 
permission checks), AS0 is used to represent 64-bit (or 32-bit) integer 
virtual addresses that are implicitly offset and bounds checked against 
a default capability.  An address space cast transforms between these 
two representations.  If an AS200 pointer points to an object that is 
not fully within the default data capability then the AS200->AS0 cast 
will give null.  Any AS0 -> AS200 cast will give the rights to the 
entire default data capability (which, in a pure-capability ABI program, 
may be no rights at all!).  There are a number of corner cases where 
this transform is not sound.

There are a number of places in LLVM that assume AS cast and bitcast are 
equivalent.  This is unfortunate because, if they were, there would be 
no point in AS cast existing).  The semantics of AS cast (or ASs in 
general) are not particularly well defined.  Matthew Arsenault has been 
doing a great job trying to pin them down.

David

On 23/03/2020 14:52, Thomas Faingnaert via llvm-dev wrote:
> I'm not sure what the usual "ping time" is for llvm-dev, but may I ask 
> if there are any updates on this?
> 
> It appears that the following lines are the root cause of the reordering 
> (https://github.com/llvm/llvm-project/blob/fdcb27105537f77c78c4473d4f7c47146ddbab69/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp#L2175):
> 
> //Handle gep(bitcast x) and gep(gep x, 0, 0, 0).
> 
> Value *StrippedPtr = PtrOp->stripPointerCasts();
> 
> PointerType *StrippedPtrTy = cast<PointerType>(StrippedPtr->getType());
> 
> 
> Namely, the stripPointerCastsfunction also strips address spaces.
> Would replacing this by stripPointerCastsSameRepresentationbe an 
> acceptable solution, or do other parts of LLVM rely on this reordering?
> If not, I can send a patch in Phabricator to address this issue.
> 
> Kind regards,
> Thomas Faingnaert
> 
> ------------------------------------------------------------------------
> *Van:* Johannes Doerfert
> *Verzonden:* Zaterdag, 14 Maart, 2020 6:50
> *Aan:* Thomas Faingnaert
> *CC:* llvm-dev at lists.llvm.org; Matthew Arsenault
> *Onderwerp:* Re: [llvm-dev] [InstCombine] Addrspacecast and GEP assumed 
> commutative
> 
> On 03/13, Thomas Faingnaert via llvm-dev wrote:
>> I have a question regarding the InstCombine pass.
>> Given the following IR:
>> 
>> define void @test(i64) {
>>   %ptr = inttoptr i64 %0 to i16*
>> 
>>   %asc = addrspacecast i16* %ptr to i16 addrspace(1)*
>>   %gep = getelementptr i16, i16 addrspace(1)* %asc, i64 16
>>   store i16 0, i16 addrspace(1)* %gep, align 16
>> 
>>   ret void
>> }
>> 
>> opt -instcombine transforms this to:
>> 
>> define void @test(i64 %0) {
>>   %ptr = inttoptr i64 %0 to i16*
>> 
>>   %gep1 = getelementptr i16, i16* %ptr, i64 16
>>   %gep = addrspacecast i16* %gep1 to i16 addrspace(1)*
>>   store i16 0, i16 addrspace(1)* %gep, align 16
>> 
>>   ret void
>> }
>> 
>> As you can see, the order of the GEP and addrspacecast is changed.
>> Is this the expected behaviour? Normally, ASC(GEP(x)) is not necessarily equivalent to GEP(ASC(x)), e.g. when the GEP moves past the end of one address space.
> 
> Yeah, I can see this being a problem. I think it is valid if the gep is
> inbounds but not otherwise. Matt, wdyt?
> 
> 
>> I stumbled upon this because this reordering actually gives problems for instruction selection in the NVPTX backend.
>> Without reordering, the above IR gets lowered to:
>> 
>> ld.param.u64 %rd1, [test_param_0];
>> cvta.to.global.u64 %rd2, %rd1;
>> mov.u16 %rs1, 0;
>> st.global.u16 [%rd2+32], %rs1;
>> ret;
>> 
>> But with the reordering, the backend instead emits this:
>> 
>> ld.param.u64 %rd1, [test_param_0];
>> add.s64 %rd2, %rd1, 32;
>> cvta.to.global.u64 %rd3, %rd2;
>> mov.u16 %rs1, 0;
>> st.global.u16 [%rd3], %rs1;
>> ret;
>> 
>> i.e. an explicit add instruction instead of folding the addition in the addressing mode of the store.
>> 
>> My question is twofold:
>> 
>>   1.  Is this reordering of GEP and ASC in the InstCombine pass the expected behaviour?
>>   2.  At the moment, I solved the issue described above by reordering the GEP and ASC during ISEL of NVPTX (https://reviews.llvm.org/D75817), but I don't check if the reordering is 
> valid. If the reordering in InstCombine is indeed the expected 
> behaviour, would it not be better to disable it? I imagine that 
> verifying if the necessary conditions hold for reordering during ISEL 
> will be quite complex.
> 
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 
> 
> -- 
> 
> Johannes Doerfert
> Researcher
> 
> Argonne National Laboratory
> Lemont, IL 60439, USA
> 
> jdoerfert at anl.gov
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 



More information about the llvm-dev mailing list