[PATCH] D81886: [AMDGPU] Add gfx1030 target

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jun 15 15:29:57 PDT 2020


rampitec marked 7 inline comments as done.
rampitec added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPU.td:396
 
+def FeatureGFX10_BEncoding : SubtargetFeature<"gfx10_b-encoding",
+  "GFX10_BEncoding",
----------------
arsenm wrote:
> We don't use underscores in any other subtarget feature name. change to -?
OK, in a follow-up patch.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPU.td:500
 
+def FeatureGetWaveIdInst : SubtargetFeature<"get-wave-id-inst",
+  "HasGetWaveIdInst",
----------------
arsenm wrote:
> Does this need its own feature apart from the instruction set?
Yes, it has been dropped.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPU.td:512
+
+def FeatureMadMacF32Insts : SubtargetFeature<"mad-mac-f32-insts",
+  "HasMadMacF32Insts",
----------------
arsenm wrote:
> mad-f32-insts may be a sufficient name?
OK, in a follow-up patch.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:926-927
   // float fr = mad(fqneg, fb, fa);
-  Value *FR = Builder.CreateIntrinsic(Intrinsic::amdgcn_fmad_ftz,
+  auto FMAD = !ST->hasMadMacF32Insts()
+                  ? Intrinsic::fma
+                  : (Intrinsic::ID)Intrinsic::amdgcn_fmad_ftz;
----------------
arsenm wrote:
> I prefer to use positive conditions and swap the select operand order
OK, in a follow-up patch.


================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:1139
+                 MachineMemOperand::MODereferenceable |
+                 MachineMemOperand::MOVolatile;
+    return true;
----------------
arsenm wrote:
> Volatile should not be necessary (at least not unconditionally)
We add it for other atomics here.


================
Comment at: llvm/lib/Target/AMDGPU/SOPInstructions.td:799-800
 let mayLoad = 1 in {
+// s_getreg_b32 should use hasSideEffects = 1 for tablegen to allow
+// its use in the readcyclecounter selection.
 def S_GETREG_B32 : SOPK_Pseudo <
----------------
arsenm wrote:
> arsenm wrote:
> > This is a problem. ReadCycleCounter should probably be marked IntrNoMem. I've also been thinking we need to introduce memory free versions of setreg pseudos that only touch the mode register
> s_getreg should definitely not have side effects. it's already a problem that it is mayLoad. We don't get CSE of addrspacecast apertures because of this. Is marking readcyclecounter IntrInaccessibleMemOnly enough to fix this?
Well, since it now reads timer... We can try to figure it out in subsequent patches.


================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/urem.i64.ll:3119
 ; CGP-NEXT:    v_trunc_f32_e32 v4, v4
-; CGP-NEXT:    v_mad_f32 v2, -v4, v3, v2
+; CGP-NEXT:    v_fma_f32 v2, -v4, v3, v2
 ; CGP-NEXT:    v_cvt_u32_f32_e32 v4, v4
----------------
arsenm wrote:
> This shouldn't have changed
This is changed because there is no -mcpu in the test.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81886/new/

https://reviews.llvm.org/D81886





More information about the llvm-commits mailing list