[PATCH] D155055: [WIP][AMDGPU] Divergence-driven instruction selection for fshr

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 27 09:22:13 PDT 2023


foad added inline comments.


================
Comment at: llvm/test/CodeGen/AMDGPU/bf16.ll:631
 ; GCN-NEXT:    s_mov_b32 s6, 0
-; GCN-NEXT:    v_alignbit_b32 v0, v1, v0, 16
+; GCN-NEXT:    v_or_b32_e32 v0, v0, v1
 ; GCN-NEXT:    s_mov_b32 s7, 0xf000
----------------
arsenm wrote:
> foad wrote:
> > foad wrote:
> > > There are lots of minor regressions like this. I will investigate.
> > The problem here is that really need to combine shifts and ORs into fshr post-legalization. This no longer happens automatically because we have marked fshr as Custom instead of Legal. I could do it with a target-specific OR combine, but I can't find any way to call back into helper code like MatchRotate in the generic DAGCombiner from a target-specific combine.
> is there just an isLegal call that needs to be isLegalOrCustom?
> is there just an isLegal call that needs to be isLegalOrCustom?

I can make that change in `DAGCombiner::MatchRotate` but it does not work:
```
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -8339,10 +8339,10 @@ SDValue DAGCombiner::MatchRotate(SDValue LHS, SDValue RHS, const SDLoc &DL) {
   // The target must have at least one rotate/funnel flavor.
   // We still try to match rotate by constant pre-legalization.
   // TODO: Support pre-legalization funnel-shift by constant.
-  bool HasROTL = hasOperation(ISD::ROTL, VT);
-  bool HasROTR = hasOperation(ISD::ROTR, VT);
-  bool HasFSHL = hasOperation(ISD::FSHL, VT);
-  bool HasFSHR = hasOperation(ISD::FSHR, VT);
+  bool HasROTL = TLI.isOperationLegalOrCustom(ISD::ROTL, VT);
+  bool HasROTR = TLI.isOperationLegalOrCustom(ISD::ROTR, VT);
+  bool HasFSHL = TLI.isOperationLegalOrCustom(ISD::FSHL, VT);
+  bool HasFSHR = TLI.isOperationLegalOrCustom(ISD::FSHR, VT);
 
   // If the type is going to be promoted and the target has enabled custom
   // lowering for rotate, allow matching rotate by non-constants. Only allow
```
The problem is that during legalization a uniform fshr will be legalized by lowering it to shifts and ORs, but this combine will immediately kick in and combine it back into a fshr. That causes an infinite loop.

Maybe the whole premise of this patch is flawed? Is it OK to say that fshr is only legal if it is divergent? Or do I have to say fshr is always legal, and then lower uniform fshr back into shift and ORs at some later stage?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155055/new/

https://reviews.llvm.org/D155055



More information about the llvm-commits mailing list