[PATCH] D104997: [WIP][AMDGPU] Deduce attributes with the Attributor

Sat Jul 10 16:28:00 PDT 2021

jdoerfert added a comment.

Did you add the tests I suggested?

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp:233
+private:
+  Optional<bool> UniformWorkGroupSize = llvm::None;
+};
----------------
kuter wrote:
> kuter wrote:
> > jdoerfert wrote:
> > > Maybe you just use the boolean state instead? Having the additional boolean here doesn't seem to bring any benefit. You can probably just expose `clampStateAndIndicateChange` in the Attributor.h (`AA` namespace) and then use it in the `CheckCallSite` callback. There should not be anything to do but ask for the callee AA and clamp. This should remove 20 or so lines here.
> > I don't think this would work. In some cases we need to be able turn `uniform-work-group-size` attribute to false even if it is initialized with true.
> > 
> > in `uniform-work-group-attribute-missing.ll` `@foo` get's initialized with `uniform-work-group-size` true. But we turn it to false because the kernel doesn't have the attribute (for kernels we assume it is false if it is not present).
> > 
> > If we where to use the BooleanState, `@foo` would automatically be at a fixpoint on initialization (since it is set to true).
> > and we wouldn't be able to turn it to false.
> Also @jdoerfert There is now several attributes that use `BooleanState` as a "placeholder" state.
> I think it would be great if we had something like a `VoidState` to avoid confusion.
> even if it was as simple as `using VoidState = BooleanState;` 
Hm, but we use the boolean in those cases don't we? To distinguish "good" and "bad". We can talk offline.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp:233
+private:
+  Optional<bool> UniformWorkGroupSize = llvm::None;
+};
----------------
jdoerfert wrote:
> kuter wrote:
> > kuter wrote:
> > > jdoerfert wrote:
> > > > Maybe you just use the boolean state instead? Having the additional boolean here doesn't seem to bring any benefit. You can probably just expose `clampStateAndIndicateChange` in the Attributor.h (`AA` namespace) and then use it in the `CheckCallSite` callback. There should not be anything to do but ask for the callee AA and clamp. This should remove 20 or so lines here.
> > > I don't think this would work. In some cases we need to be able turn `uniform-work-group-size` attribute to false even if it is initialized with true.
> > > 
> > > in `uniform-work-group-attribute-missing.ll` `@foo` get's initialized with `uniform-work-group-size` true. But we turn it to false because the kernel doesn't have the attribute (for kernels we assume it is false if it is not present).
> > > 
> > > If we where to use the BooleanState, `@foo` would automatically be at a fixpoint on initialization (since it is set to true).
> > > and we wouldn't be able to turn it to false.
> > Also @jdoerfert There is now several attributes that use `BooleanState` as a "placeholder" state.
> > I think it would be great if we had something like a `VoidState` to avoid confusion.
> > even if it was as simple as `using VoidState = BooleanState;` 
> Hm, but we use the boolean in those cases don't we? To distinguish "good" and "bad". We can talk offline.
> I don't think this would work. ..

It does, if you online finalize kernels in the initialization everything else should not even read the attribute from the IR but just start optimistic and then go on.

================
Comment at: llvm/test/CodeGen/AMDGPU/simple-indirect-call.ll:51

+;FIXME: The AMDGPU Attributor does not deduce the uniform-group-size attribute.
 ; attributes #0 = { "amdgpu-dispatch-id" "amdgpu-dispatch-ptr" "amdgpu-implicitarg-ptr" "amdgpu-work-group-id-x" "amdgpu-work-group-id-y" "amdgpu-work-group-id-z" "amdgpu-work-item-id-x" "amdgpu-work-item-id-y" "amdgpu-work-item-id-z" }
----------------
it doesnt?

================
Comment at: llvm/test/CodeGen/AMDGPU/uniform-work-group-propagate-attribute.ll:25
   ret void
 }

----------------
`weak_func` can be annotated with `"uniform-work-group-size"="true"` because there is no way it is called from a kernel with `"uniform-work-group-size"="false"`. The test is just too conservative here. Please don't keep the linkage check as it is not needed.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104997/new/

https://reviews.llvm.org/D104997