[PATCH] D81886: [AMDGPU] Add gfx1030 target
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jun 15 15:29:57 PDT 2020
rampitec marked 7 inline comments as done.
rampitec added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPU.td:396
+def FeatureGFX10_BEncoding : SubtargetFeature<"gfx10_b-encoding",
+ "GFX10_BEncoding",
----------------
arsenm wrote:
> We don't use underscores in any other subtarget feature name. change to -?
OK, in a follow-up patch.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPU.td:500
+def FeatureGetWaveIdInst : SubtargetFeature<"get-wave-id-inst",
+ "HasGetWaveIdInst",
----------------
arsenm wrote:
> Does this need its own feature apart from the instruction set?
Yes, it has been dropped.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPU.td:512
+
+def FeatureMadMacF32Insts : SubtargetFeature<"mad-mac-f32-insts",
+ "HasMadMacF32Insts",
----------------
arsenm wrote:
> mad-f32-insts may be a sufficient name?
OK, in a follow-up patch.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:926-927
// float fr = mad(fqneg, fb, fa);
- Value *FR = Builder.CreateIntrinsic(Intrinsic::amdgcn_fmad_ftz,
+ auto FMAD = !ST->hasMadMacF32Insts()
+ ? Intrinsic::fma
+ : (Intrinsic::ID)Intrinsic::amdgcn_fmad_ftz;
----------------
arsenm wrote:
> I prefer to use positive conditions and swap the select operand order
OK, in a follow-up patch.
================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:1139
+ MachineMemOperand::MODereferenceable |
+ MachineMemOperand::MOVolatile;
+ return true;
----------------
arsenm wrote:
> Volatile should not be necessary (at least not unconditionally)
We add it for other atomics here.
================
Comment at: llvm/lib/Target/AMDGPU/SOPInstructions.td:799-800
let mayLoad = 1 in {
+// s_getreg_b32 should use hasSideEffects = 1 for tablegen to allow
+// its use in the readcyclecounter selection.
def S_GETREG_B32 : SOPK_Pseudo <
----------------
arsenm wrote:
> arsenm wrote:
> > This is a problem. ReadCycleCounter should probably be marked IntrNoMem. I've also been thinking we need to introduce memory free versions of setreg pseudos that only touch the mode register
> s_getreg should definitely not have side effects. it's already a problem that it is mayLoad. We don't get CSE of addrspacecast apertures because of this. Is marking readcyclecounter IntrInaccessibleMemOnly enough to fix this?
Well, since it now reads timer... We can try to figure it out in subsequent patches.
================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/urem.i64.ll:3119
; CGP-NEXT: v_trunc_f32_e32 v4, v4
-; CGP-NEXT: v_mad_f32 v2, -v4, v3, v2
+; CGP-NEXT: v_fma_f32 v2, -v4, v3, v2
; CGP-NEXT: v_cvt_u32_f32_e32 v4, v4
----------------
arsenm wrote:
> This shouldn't have changed
This is changed because there is no -mcpu in the test.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D81886/new/
https://reviews.llvm.org/D81886
More information about the llvm-commits
mailing list