[PATCH] D119216: [AMDGPU] replace hostcall module flag with function attribute

Sameer Sahasrabuddhe via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Feb 9 03:07:44 PST 2022


sameerds added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp:566
+      return false;
+    };
+
----------------
sameerds wrote:
> jdoerfert wrote:
> > sameerds wrote:
> > > jdoerfert wrote:
> > > > sameerds wrote:
> > > > > jdoerfert wrote:
> > > > > > jdoerfert wrote:
> > > > > > > sameerds wrote:
> > > > > > > > jdoerfert wrote:
> > > > > > > > > You can use AAPointerInfo for the call site return IRPosition. It will (through the iterations) gather all accesses and put them into "bins" based on offset and size. It deals with uses in calls, etc. and if there is stuff missing it is better to add it in one place so we benefit throughout. 
> > > > > > > > I am not following what you have in mind. "implicitarg_ptr" is a pointer returned by an intrinsic that reads an ABI-defined register. I need to check that for a given call-graph, a particular range of bytes relative to that base pointer are never accessed. The above DFS on the uses conservatively assumes that such a load exists unless it can conclusively trace every use of the base pointer. This may include the pointer being passed to an extern function or being stored into a different memory location (although we don't expect ABI registers being capture this way). I am not seeing how to construct this around AAPointerInfo. As far as I can see, the public interface only talks about uses that are recognized as loads and stores.
> > > > > > > Not actually tested, replaces the function body. Depends on D119249.
> > > > > > > ```
> > > > > > > const auto PointerInfoAA = A.getAAFor<AAPointerInfo>(*this, IRPosition::callback_returned(cast<CallBase>(Ptr)), DepClassTy::Required);
> > > > > > > if (!PointerInfoAA.getState().isValidState())
> > > > > > >   return true; // Abort (which is weird as false is abort in the other CB).
> > > > > > > AAPointerInfo::OffsetAndSize OAS(*Position, /* probably look pointer width up in DL */ 8);
> > > > > > > return !forallInterferingAccesses(OAS, [](const AAPointerInfo::Access &Acc, bool IsExact) {
> > > > > > >    return Acc.getRemoteInst()->isDroppable(); });
> > > > > > > ```
> > > > > > You don't actually need the state check.
> > > > > > And as I said, this will take care of following pointers passed into callees or through memory to other places, all while ignoring dead code, etc.
> > > > > I see now. forallInterferingAccesses() does check for valid state on entry, which is sufficient to take care of all the opaque uses like a call to an extern function or a complicated phi or a capturing store. Thanks a ton ... this has been very educational!
> > > > Yes, all "forAll" functions will return `false` if we cannot visit "all". Though, these methods will utilize the smarts, e.g., ignore what is dead, look at loads if the value is stored in a way we can track it through memory, transfer accesses in a callee to the caller "space" if a pointer is passed to the callee,... etc.
> > > > Yes, all "forAll" functions will return `false` if we cannot visit "all". Though, these methods will utilize the smarts, e.g., ignore what is dead, look at loads if the value is stored in a way we can track it through memory, transfer accesses in a callee to the caller "space" if a pointer is passed to the callee,... etc.
> > > 
> > > @jdoerfert, do you see  D119249 landing soon? We are kinda on a short runway here (less than a handful of days) and hoping to land this review quickly because it solves an important issue. I would prefer to have the check that you outlined, but the alternative is to let my version through now, and then update it when the new interface becomes available.
> > Did you test my proposed code? I'll land my patch tomorrow, if yours works as you expect with the proposed AAPointerInfo use, great. If it doesn't I don't necessarily mind you merging something else with a clear intention to address issues and remove duplication as we go.
> > 
> > I can lift my commit block as we go and @arsenm can also give it a good-to-go if need be.
> Lifting your commit block would be useful in general. I certainly do not intend to submit something in a hurry.
> 
> I applied your change and tested the proposed code. It gives more pessimistic results than my original crude version, i.e., it marks some uses of the `implicitarg_ptr` as accessing the target range, when I am pretty sure it does not. See `test_kernel42` in the lit test `hsa-metadata-hostcall-v3.ll` in this review. It is intended to access eight bytes from 16 onwards, which clearly does not overlap with offset 24, which is the start of the target range (hostcall pointer).
> 
> Strangely, in the debug messages, the `AAPointerInfoCallsiteReturned()` constructor is called, but I don't see any messages marked [AAPointerInfo].
Good news. The PointerInfo works as expected ... it's just that AMDGPUAttributor has a very short Allowed list, and I needed to add PointerInfo there.

So @jdoerfert, D119249 works for me with the suggested changes. If that makes it to mainline, I'll update this change to use it.

@arsenm do you see any issues with enabling AAPointerInfo in the AMDGPUAttributor? Will there be concerns about it being "too heavy" or something?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119216/new/

https://reviews.llvm.org/D119216



More information about the cfe-commits mailing list