[PATCH] D81886: [AMDGPU] Add gfx1030 target
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jun 15 15:29:50 PDT 2020
arsenm added inline comments.
================
Comment at: clang/lib/Basic/Targets/AMDGPU.cpp:174
+ case GK_GFX1030:
+ Features["ci-insts"] = true;
+ Features["dot1-insts"] = true;
----------------
Unrelated, but I think we should stop emitting the features in the target-features string if it's present on the base device
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPU.td:396
+def FeatureGFX10_BEncoding : SubtargetFeature<"gfx10_b-encoding",
+ "GFX10_BEncoding",
----------------
We don't use underscores in any other subtarget feature name. change to -?
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPU.td:500
+def FeatureGetWaveIdInst : SubtargetFeature<"get-wave-id-inst",
+ "HasGetWaveIdInst",
----------------
Does this need its own feature apart from the instruction set?
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPU.td:512
+
+def FeatureMadMacF32Insts : SubtargetFeature<"mad-mac-f32-insts",
+ "HasMadMacF32Insts",
----------------
mad-f32-insts may be a sufficient name?
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:926-927
// float fr = mad(fqneg, fb, fa);
- Value *FR = Builder.CreateIntrinsic(Intrinsic::amdgcn_fmad_ftz,
+ auto FMAD = !ST->hasMadMacF32Insts()
+ ? Intrinsic::fma
+ : (Intrinsic::ID)Intrinsic::amdgcn_fmad_ftz;
----------------
I prefer to use positive conditions and swap the select operand order
================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:1139
+ MachineMemOperand::MODereferenceable |
+ MachineMemOperand::MOVolatile;
+ return true;
----------------
Volatile should not be necessary (at least not unconditionally)
================
Comment at: llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp:223
const AMDGPU::MIMGInfo *Info = AMDGPU::getMIMGInfo(MI.getOpcode());
- if (Info->MIMGEncoding != AMDGPU::MIMGEncGfx10NSA)
+ if (!Info || Info->MIMGEncoding != AMDGPU::MIMGEncGfx10NSA)
return;
----------------
It seems like a bug this would be missing?
================
Comment at: llvm/lib/Target/AMDGPU/SOPInstructions.td:799-800
let mayLoad = 1 in {
+// s_getreg_b32 should use hasSideEffects = 1 for tablegen to allow
+// its use in the readcyclecounter selection.
def S_GETREG_B32 : SOPK_Pseudo <
----------------
This is a problem. ReadCycleCounter should probably be marked IntrNoMem. I've also been thinking we need to introduce memory free versions of setreg pseudos that only touch the mode register
================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/urem.i64.ll:3119
; CGP-NEXT: v_trunc_f32_e32 v4, v4
-; CGP-NEXT: v_mad_f32 v2, -v4, v3, v2
+; CGP-NEXT: v_fma_f32 v2, -v4, v3, v2
; CGP-NEXT: v_cvt_u32_f32_e32 v4, v4
----------------
This shouldn't have changed
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D81886/new/
https://reviews.llvm.org/D81886
More information about the llvm-commits
mailing list