[PATCH] D46365: AMDGPU: Separate R600 and GCN TableGen files
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed May 23 23:57:38 PDT 2018
arsenm added inline comments.
================
Comment at: lib/Target/AMDGPU/AMDGPUInstrInfo.cpp:27
// Pin the vtable to this file.
-void AMDGPUInstrInfo::anchor() {}
-
-AMDGPUInstrInfo::AMDGPUInstrInfo(const AMDGPUSubtarget &ST)
- : AMDGPUGenInstrInfo(AMDGPU::ADJCALLSTACKUP, AMDGPU::ADJCALLSTACKDOWN),
- ST(ST),
- AMDGPUASI(ST.getAMDGPUAS()) {}
-
-// FIXME: This behaves strangely. If, for example, you have 32 load + stores,
-// the first 16 loads will be interleaved with the stores, and the next 16 will
-// be clustered as expected. It should really split into 2 16 store batches.
-//
-// Loads are clustered until this returns false, rather than trying to schedule
-// groups of stores. This also means we have to deal with saying different
-// address space loads should be clustered, and ones which might cause bank
-// conflicts.
-//
-// This might be deprecated so it might not be worth that much effort to fix.
-bool AMDGPUInstrInfo::shouldScheduleLoadsNear(SDNode *Load0, SDNode *Load1,
- int64_t Offset0, int64_t Offset1,
- unsigned NumLoads) const {
- assert(Offset1 > Offset0 &&
- "Second offset should be larger than first offset!");
- // If we have less than 16 loads in a row, and the offsets are within 64
- // bytes, then schedule together.
-
- // A cacheline is 64 bytes (for global memory).
- return (NumLoads <= 16 && (Offset1 - Offset0) < 64);
-}
-
-// This must be kept in sync with the SIEncodingFamily class in SIInstrInfo.td
-enum SIEncodingFamily {
- SI = 0,
- VI = 1,
- SDWA = 2,
- SDWA9 = 3,
- GFX80 = 4,
- GFX9 = 5
-};
+//void AMDGPUInstrInfo::anchor() {}
----------------
Commented-out code; please delete this line rather than leaving it commented out.
================
Comment at: lib/Target/AMDGPU/AMDGPUInstructions.td:49
+
+class ILFormat<dag outs, dag ins, string asmstr, list<dag> pattern>
+: Instruction {
----------------
Should probably rename this at some point
================
Comment at: lib/Target/AMDGPU/AMDGPUMCInstLower.cpp:127-130
+ if (ST.getTargetTriple().getArch() == Triple::amdgcn)
+ MCOp = MCOperand::createReg(AMDGPU::getMCReg(MO.getReg(), ST));
+ else
+ MCOp = MCOperand::createReg(MO.getReg());
----------------
Why is there a difference here?
================
Comment at: lib/Target/AMDGPU/AMDGPUSubtarget.h:207-208
+
+class AMDGPUSubtarget : public AMDGPUGenSubtargetInfo,
+ public AMDGPUCommonSubtarget {
public:
----------------
Why isn't this SISubtarget/GCNSubtarget?
================
Comment at: lib/Target/AMDGPU/AMDGPUSubtarget.h:436-438
+ virtual bool hasAddr64() const {
+ return (getGeneration() < AMDGPUSubtarget::VOLCANIC_ISLANDS);
}
----------------
Why is this needed outside of GCN code?
================
Comment at: lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:580
Type *SubTp) {
- if (ST->hasVOP3PInsts()) {
+ if (isGCN() && static_cast<const AMDGPUSubtarget*>(ST)->hasVOP3PInsts()) {
VectorType *VT = cast<VectorType>(Tp);
----------------
Can't this be in a GCN class? I think all of this is for the packed instructions anyway.
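Something like this sketch, maybe (the GCNTTIImpl name and a GCN-only subtarget member are assumptions here, not something this patch adds):

    // Sketch only: keep the VOP3P-specific shuffle costing in a
    // hypothetical GCN-only TTI class so the common AMDGPU code
    // never needs the isGCN() check or the static_cast.
    int GCNTTIImpl::getShuffleCost(TTI::ShuffleKind Kind, Type *Tp,
                                   int Index, Type *SubTp) {
      if (ST->hasVOP3PInsts()) {
        // ... packed <2 x i16> / <2 x half> special cases ...
      }
      return BaseT::getShuffleCost(Kind, Tp, Index, SubTp);
    }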
================
Comment at: lib/Target/AMDGPU/R600ISelLowering.cpp:1584-1586
+ case CallingConv::C:
+ case CallingConv::Fast:
+ case CallingConv::Cold:
----------------
Probably should reject these, but that's a separate change
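For reference, the rejection could look roughly like this (sketch only; the diagnostic wording is made up):

    case CallingConv::C:
    case CallingConv::Fast:
    case CallingConv::Cold:
      report_fatal_error("unsupported calling convention for R600");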
================
Comment at: lib/Target/AMDGPU/R600IntrinsicInfo.cpp:23-24
+
+R600IntrinsicInfo::R600IntrinsicInfo()
+ : TargetIntrinsicInfo() {}
+
----------------
We can probably drop the whole class. We don't have very many intrinsic definitions left in the backend, and this code never really worked well to begin with
================
Comment at: lib/Target/AMDGPU/SIInsertWaitcnts.cpp:929-930
// with knowledge of the called routines.
- if (MI.getOpcode() == AMDGPU::RETURN ||
+ if (
MI.getOpcode() == AMDGPU::SI_RETURN_TO_EPILOG ||
MI.getOpcode() == AMDGPU::S_SETPC_B64_return) {
----------------
Fix formatting
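i.e. something like:

    if (MI.getOpcode() == AMDGPU::SI_RETURN_TO_EPILOG ||
        MI.getOpcode() == AMDGPU::S_SETPC_B64_return) {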
Repository:
rL LLVM
https://reviews.llvm.org/D46365