[PATCH] D135447: [AMDGPU] Add llvm.is.fpclass intrinsic to existing SelectionDAG fp class support and introduce GlobalISel implementation for AMDGPU

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 16 14:54:21 PST 2022


arsenm added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp:314
+
+  setOperationAction(
+      ISD::IS_FPCLASS,
----------------
Can you add a fixme that we just want scalarization?


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:870
+
+bool AMDGPUInstructionSelector::selectG_IS_FPCLASS(MachineInstr &I) const {
+  MachineBasicBlock *BB = I.getParent();
----------------
JanekvO wrote:
> arsenm wrote:
> > I don't see why you need to manually select this (maybe sharing the pattern between the existing intrinsic is annoying because the new intrinsic uses immarg?)
> I did look on whether I could re-use some of the existing tablegen but I couldn't get it quite into the right shape for it to match. `llvm.is.fpclass` requires the mask to be an immarg as you mentioned so materializing the immediate into a register anywhere before this function results in a verifier error.
You might need to split it into a different pattern instantiation, but you would just need the S_MOV_B32 from the mask to the constant (although I actually would expect it to work if you directly folded the constant anyway, since the operand should have been copied to VGPR anyway). Something like:


```
class ClassPat<Instruction inst, ValueType vt> : GCNPat <
  (fp_class (VOP3Mods vt:$src0, i32:$src0_mods), (i32 timm:mask))
  (inst $src0_mods, VSrc_b32:$src0, $src0_mods, (S_MOV_B32 $mask))
>;
```


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:983
+    .widenScalarToNextPow2(1)
+    .clampScalar(1, S32, S64)
+    .scalarize(0);
----------------
I think this clampScalar isn't doing anything and can be dropped


================
Comment at: llvm/test/CodeGen/AMDGPU/llvm.is.fpclass.ll:11
+; RUN:  llc -global-isel=1 -march=amdgcn -mcpu=gfx1100 -verify-machineinstrs < %s | FileCheck --check-prefix=GFX11CHECK %s
+
+define i1 @isnan_float(float %x) nounwind {
----------------
Can you also add some cases where the input will be an SGPR?


================
Comment at: llvm/test/CodeGen/AMDGPU/llvm.is.fpclass.ll:106
+
+define <3 x i1> @isnan_v3float(<3 x float> %x) nounwind {
+; GFX7CHECK-LABEL: isnan_v3float:
----------------
s/float/f32 in these function names 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135447/new/

https://reviews.llvm.org/D135447



More information about the llvm-commits mailing list