[PATCH] D101847: [AMDGPU] Fix function pointer argument bug in AMDGPU Propagate Attributes pass.

Jacob Weightman via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu May 6 13:27:55 PDT 2021


jweightman added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUPropagateAttributes.cpp:252
         CallBase *CI = dyn_cast<CallBase>(I);
-        if (!CI)
+        if (!CI || CI->getCalledFunction() != &F)
           continue;
----------------
arsenm wrote:
> madhur13490 wrote:
> > madhur13490 wrote:
> > > arsenm wrote:
> > > > jweightman wrote:
> > > > > arsenm wrote:
> > > > > > jweightman wrote:
> > > > > > > arsenm wrote:
> > > > > > > > What if the callee was a bitcast of the function? Can you add another testcase with a constexpr function pointer cast?
> > > > > > > I tried adding this to the LIT test:
> > > > > > > ```
> > > > > > >     %h_cast = bitcast i32* ()* @h to i8* ()*
> > > > > > >     %1 = call i8* %h_cast()
> > > > > > > ...
> > > > > > > define private i32* @h() #1 {
> > > > > > >     %1 = alloca i32
> > > > > > >     ret i32* %1
> > > > > > > }
> > > > > > > ```
> > > > > > > 
> > > > > > > and the pass does not propagate the attributes like one would probably want. However, this seems pretty orthogonal to the problem I was trying to fix (preventing the illegal callee change), because the "user" is an `llvm::BitCastInstr` so `CI` is always `nullptr` in that case. 
> > > > > > > 
> > > > > > > In order to handle bitcasts, I think we would need new logic to detect which bitcasted functions need to be replaced, a new data structure to track them in (similar to the existing `ToReplace`), and new logic to do that replacement — all of which would be on a separate logical branch from the existing code for handling call instructions! Unless you see an easier way, I think such a change is beyond the scope of what I was trying to do.
> > > > > > This example isn't a constant expression cast. If you run opt -S -instcombine on this, it should turn it into the constant expr cast form. In that case the user will still be CallInst, it's just CI->getCalledFunction() will be a constantexpr cast of the function. stripPointerCasts should get you the base function
> > > > > I see. I misunderstood your initial comment. This seems like a reasonable thing to try to support with this diff.
> > > > > 
> > > > > The instcombine pass seems to prefer to cast the argument/return value (I modified the test to try both) rather than the function pointer itself. Going back to the IR Language Reference, I saw the constant expression section (I'm still new at this...), so I tried this IR:
> > > > > ```
> > > > > %1 = alloca i8
> > > > > %2 = call i8* bitcast (i32* ()* @h to i8* ()*)()
> > > > > ```
> > > > > However, the Propagate Attributes pass doesn't produce a clone of `@h` in this situation. I see that the `dyn_cast<Instruction>(U)` fails because the user of `h` is just the constexpr bitcast:
> > > > > ```
> > > > > > U->dump()
> > > > > void (i8*)* bitcast (void (i32*)* @g to void (i8*)*)
> > > > > ```
> > > > > Sorry for the back and forth, but is there some way to find an enclosing `CallBase` if it exists? 
> > > > I don't think there's a super easy way to do it. You would have to find the context. The normal way to check that would be to start from the instruction and look into the constantexpr for the global, which starts looking like general indirect call handling.
> > > > 
> > > > Ultimately this pass is a grotesque hack that should absolutely not exist, and only does because comgr still does not link the device libraries correctly for opencl. How did you run into this?
> > > 
> > > > Ultimately this pass is a grotesque hack that should absolutely not exist, and only does because comgr still does not link the device libraries correctly for opencl. How did you run into this?
> > > 
> > > I don't think that should be the only reason. In case of indirect calls, we may want to clone functions based on conflicting register budget. So, this should exist but probably with some fixes.
> > > 
> > User of 'bitcast' operator is `CallBase`, so you will have to go one level up in user "chain". 
> But you don't know how many layers up you would need to go. You could potentially need to do an exhaustive search which touches a lot of the program
@arsenm to your question of how I ran into this: I ran into this issue while working on the Cray compiler. With the way we have things set up, it is possible to have indirect calls in the IR when this pass runs, though I'm not sure if it's cause the issue with Clang.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUPropagateAttributes.cpp:252
         CallBase *CI = dyn_cast<CallBase>(I);
-        if (!CI)
+        if (!CI || CI->getCalledFunction() != &F)
           continue;
----------------
jweightman wrote:
> arsenm wrote:
> > madhur13490 wrote:
> > > madhur13490 wrote:
> > > > arsenm wrote:
> > > > > jweightman wrote:
> > > > > > arsenm wrote:
> > > > > > > jweightman wrote:
> > > > > > > > arsenm wrote:
> > > > > > > > > What if the callee was a bitcast of the function? Can you add another testcase with a constexpr function pointer cast?
> > > > > > > > I tried adding this to the LIT test:
> > > > > > > > ```
> > > > > > > >     %h_cast = bitcast i32* ()* @h to i8* ()*
> > > > > > > >     %1 = call i8* %h_cast()
> > > > > > > > ...
> > > > > > > > define private i32* @h() #1 {
> > > > > > > >     %1 = alloca i32
> > > > > > > >     ret i32* %1
> > > > > > > > }
> > > > > > > > ```
> > > > > > > > 
> > > > > > > > and the pass does not propagate the attributes like one would probably want. However, this seems pretty orthogonal to the problem I was trying to fix (preventing the illegal callee change), because the "user" is an `llvm::BitCastInstr` so `CI` is always `nullptr` in that case. 
> > > > > > > > 
> > > > > > > > In order to handle bitcasts, I think we would need new logic to detect which bitcasted functions need to be replaced, a new data structure to track them in (similar to the existing `ToReplace`), and new logic to do that replacement — all of which would be on a separate logical branch from the existing code for handling call instructions! Unless you see an easier way, I think such a change is beyond the scope of what I was trying to do.
> > > > > > > This example isn't a constant expression cast. If you run opt -S -instcombine on this, it should turn it into the constant expr cast form. In that case the user will still be CallInst, it's just CI->getCalledFunction() will be a constantexpr cast of the function. stripPointerCasts should get you the base function
> > > > > > I see. I misunderstood your initial comment. This seems like a reasonable thing to try to support with this diff.
> > > > > > 
> > > > > > The instcombine pass seems to prefer to cast the argument/return value (I modified the test to try both) rather than the function pointer itself. Going back to the IR Language Reference, I saw the constant expression section (I'm still new at this...), so I tried this IR:
> > > > > > ```
> > > > > > %1 = alloca i8
> > > > > > %2 = call i8* bitcast (i32* ()* @h to i8* ()*)()
> > > > > > ```
> > > > > > However, the Propagate Attributes pass doesn't produce a clone of `@h` in this situation. I see that the `dyn_cast<Instruction>(U)` fails because the user of `h` is just the constexpr bitcast:
> > > > > > ```
> > > > > > > U->dump()
> > > > > > void (i8*)* bitcast (void (i32*)* @g to void (i8*)*)
> > > > > > ```
> > > > > > Sorry for the back and forth, but is there some way to find an enclosing `CallBase` if it exists? 
> > > > > I don't think there's a super easy way to do it. You would have to find the context. The normal way to check that would be to start from the instruction and look into the constantexpr for the global, which starts looking like general indirect call handling.
> > > > > 
> > > > > Ultimately this pass is a grotesque hack that should absolutely not exist, and only does because comgr still does not link the device libraries correctly for opencl. How did you run into this?
> > > > 
> > > > > Ultimately this pass is a grotesque hack that should absolutely not exist, and only does because comgr still does not link the device libraries correctly for opencl. How did you run into this?
> > > > 
> > > > I don't think that should be the only reason. In case of indirect calls, we may want to clone functions based on conflicting register budget. So, this should exist but probably with some fixes.
> > > > 
> > > User of 'bitcast' operator is `CallBase`, so you will have to go one level up in user "chain". 
> > But you don't know how many layers up you would need to go. You could potentially need to do an exhaustive search which touches a lot of the program
> @arsenm to your question of how I ran into this: I ran into this issue while working on the Cray compiler. With the way we have things set up, it is possible to have indirect calls in the IR when this pass runs, though I'm not sure if it's cause the issue with Clang.
See the loop that I added above. Hopefully that isn't too invasive, though it's possibly not the most robust way of doing it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101847/new/

https://reviews.llvm.org/D101847



More information about the llvm-commits mailing list