[llvm] RFC: [AMDGPU] Select CONVERGENCECTRL_GLUE generically (PR #87509)

Sun Apr 7 21:16:33 PDT 2024

ssahasra wrote:

> I appreciate that you've made it Just Work for all intrinsics, but your approach "merely" handles intrinsics and I'd like to make it work for other convergent SDNodes too, while keeping friction low.

Yeah, that's a good point. Are these two overlapping sets? The set of intrinsics in MIR and the set of equivalent instructions. Is there a point in the lowering where both can be present, and convergence tokens have not disappeared yet?

> I don't really buy this. Yes glue may be used for different reasons, but the way the instruction selector handles it is the same in all cases: it copies the glue operand from the input node to the output node. This is exactly what SDNPInGlue/SDNPOptInGlue enables in the tablegenerated selector.

Ok. What I am inferring here is that there is at most one glue operand on an instruction, and if multiple nodes need to be glued, they need to be chained (with more glue) to the same operand. This one operand will be preserved with the right properties, and then those glued nodes will be translated by the usual rules. Is that correct?

> The first thing is adding generic selection from ISD::CONVERGENCECTRL_GLUE -> TargetOpcode::CONVERGENCECTRL_GLUE so that AMDGPUDAGToDAGISel::SelectINTRINSIC_WO_CHAIN doesn't have to manually create the TargetOpcode::CONVERGENCECTRL_GLUE node. I really don't think this is controversial, based on the argument that all type of glue are handled the same way.

Agreed.

> The second thing is adding an explicit SDNPOptInGlue to convergent intrinsic definitions so that AMDGPUDAGToDAGISel::SelectINTRINSIC_WO_CHAIN doesn't have to manually copy the glue operand. Would you be happier with this if there was some assertion (in tablegen, or generic selectiondag, or the AMDGPU backend) that _all_ convergent SDNodes are marked with SDNPInGlue/SDNPOptInGlue?

I am very much in favour for not having special C++ code for something that can happen automatically. The added assert is a bonus. I would first go for a TableGen static assert. At this point, the idea of gluing tokens is pretty much target independent, so the assert should also not be limited to AMDGPU.

https://github.com/llvm/llvm-project/pull/87509