[PATCH] D50575: [AMDGPU] Add support for a16 modifier for gfx9

Ryan Taylor via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 13 16:33:05 PDT 2018


rtaylor added inline comments.


================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:4648
+          VAddrs.push_back(DimZ);
+          i++;
+        }
----------------
rtaylor wrote:
> arsenm wrote:
> > I don't understand the overall flow of all of these sections. There are 3 nearly identical-looking parts for the different NumGradients, and I'm further confused by the manipulation of i inside the loop
> Fair enough, this could probably be consolidated, there is some redundancy, I suppose I could make these functions and call them and pass them, though that seems excessive also. The way the address components are packed differs depending on the dimension and the ordering makes it a bit difficult to cleanly do this. I could also do short sections with long conditions.
> 
> 1D condition packs an undef with dx/dh and an undef with dy/dh.
> 2D condition packs dx/dh and dy/dh together and dx/dt and dy/dt together.
> 3D condition does 2D condition but packs an undef with dz/dh and an undef with dz/dt.
> 
> The ordering all out would be dxdh, dydh, dzdh, dxdt, dydt, dzdt.
Fair enough; there is some redundancy, and this could probably be consolidated. I could factor these sections into helper functions and call them, though that seems excessive. The address components are packed differently depending on the dimension, and the ordering makes it difficult to consolidate cleanly. Another option is short sections with long conditions.

1D condition packs an undef with dx/dh and an undef with dx/dt.
2D condition packs dx/dh and dy/dh together and dx/dt and dy/dt together.
3D condition does 2D condition but packs an undef with dz/dh and an undef with dz/dt.

The full ordering would be dxdh, dydh, dzdh, dxdt, dydt, dzdt.
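
To make the packing rules concrete, here is a minimal standalone sketch. It is not the code from the patch: packGradients and PackedDword are hypothetical names, strings stand in for the 16-bit SDValue operands, and placing the real component in the low half with undef in the high half is an assumption made only for illustration.

```cpp
#include <string>
#include <utility>
#include <vector>

// One 32-bit address dword modeled as two 16-bit halves: (low, high).
// Assumption: the real component sits in the low half and undef in the
// high half; the review comment does not say which half holds the undef.
using PackedDword = std::pair<std::string, std::string>;

// Hypothetical helper, not part of the patch. Walks the gradients in the
// order dxdh, dydh, dzdh, dxdt, dydt, dzdt and pairs them per direction,
// padding with "undef" when a direction has an odd number of components.
std::vector<PackedDword> packGradients(unsigned NumDims) {
  static const char *const Axes[] = {"x", "y", "z"};
  static const char *const Dirs[] = {"h", "t"};

  std::vector<PackedDword> Dwords;
  if (NumDims == 0 || NumDims > 3)
    return Dwords; // Only 1D, 2D and 3D are described in the comment.

  for (const char *Dir : Dirs) {
    for (unsigned I = 0; I < NumDims; I += 2) {
      std::string Lo = std::string("d") + Axes[I] + "d" + Dir;
      std::string Hi = (I + 1 < NumDims)
                           ? std::string("d") + Axes[I + 1] + "d" + Dir
                           : std::string("undef");
      Dwords.emplace_back(Lo, Hi);
    }
  }
  return Dwords;
}
```

With this sketch, packGradients(1) yields (dxdh, undef), (dxdt, undef); packGradients(2) yields (dxdh, dydh), (dxdt, dydt); and packGradients(3) yields the 2D pairs with (dzdh, undef) and (dzdt, undef) following each direction. A single loop like this is one way the three near-identical sections might be merged.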


Repository:
  rL LLVM

https://reviews.llvm.org/D50575




