[PATCH] D50575: [AMDGPU] Add support for a16 modifier for gfx9
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Aug 10 12:15:14 PDT 2018
arsenm added inline comments.
================
Comment at: lib/Target/AMDGPU/AMDGPU.td:245-246
+def FeatureR128A16 : SubtargetFeature<"a16",
+ "HasR128A16",
+ "true",
----------------
Feature name doesn't match, r128-a16?
================
Comment at: lib/Target/AMDGPU/AMDGPU.td:248
+ "true",
+ "Support 16 bit address types on gfx9"
+>;
----------------
Should be more specific about what kind of address
================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:4613-4614
}
+ // check for 16 bit addresses and pack if true
+ unsigned DimIdx = AddrIdx + BaseOpcode->NumExtraArgs;
----------------
Capitalize and punctuate
================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:4618
+ if (VAddrVT.getScalarType() == MVT::f16 &&
+ Subtarget->getGeneration() >= GCNSubtarget::GFX9) {
+ IsA16 = true;
----------------
This should be checking the subtarget feature you added?
================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:4620
+ IsA16 = true;
+ VAddrs.clear();
+ // push back non-packed 32 bit addresses
----------------
Why is this clearing?
================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:4621
+ VAddrs.clear();
+ // push back non-packed 32 bit addresses
+ for (unsigned i = 0; i < BaseOpcode->NumExtraArgs; ++i)
----------------
Capitalize etc.
================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:4624
+ VAddrs.push_back(Op.getOperand(AddrIdx + i));
+ // pack derivatives
+ unsigned DimIdx = AddrIdx + BaseOpcode->NumExtraArgs;
----------------
Capitalize etc.
================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:4628-4630
+ SDValue DimX = Op.getOperand(DimIdx + i);
+ SDValue DimY = DAG.getUNDEF(MVT::f16);
+ DimX = DAG.getBuildVector(MVT::v2f16, DL, {DimX, DimY});
----------------
This is a SCALAR_TO_VECTOR
================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:4631
+ DimX = DAG.getBuildVector(MVT::v2f16, DL, {DimX, DimY});
+ DimX = DAG.getBitcast(MVT::f32, DimX);
+ VAddrs.push_back(DimX);
----------------
Why this bitcast, and why f32? If we're coercing vector types i32 would be correct
================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:4648
+ VAddrs.push_back(DimZ);
+ i++;
+ }
----------------
I don't understand the overall flow of all of these sections. There 3 nearly identical looking parts for the different NumGradients, and I'm further confused by the manipulation of i inside the loop
================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:4660-4662
+ AddrLo = DAG.getBuildVector(MVT::v2f16, DL, {AddrLo, AddrHi});
+ AddrLo = DAG.getBitcast(MVT::f32, AddrLo);
+ VAddrs.push_back(AddrLo);
----------------
Ditto
Repository:
rL LLVM
https://reviews.llvm.org/D50575
More information about the llvm-commits
mailing list