[PATCH] D50633: [AMDGPU] Add new Mode Register pass

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 1 11:31:56 PDT 2018


rampitec added a comment.

In https://reviews.llvm.org/D50633#1284259, @timcorringham wrote:

> Yes, I think that your suggestion is the correct solution in a perfect world. It is one of the possible approaches that we discussed in our team before implementing the current proposed solution.
>
> The specific issue we are trying to solve is that the three 16-bit interpolation instructions need a non-default rounding mode. These are not yet widely used, but of course we need to ensure they work when they are used.
>
> Modelling the Mode register as a separate register for each field would allow LLVM to track the values and minimise the number of changes required, and having a dependency on that register would avoid any issues with scheduling. To be complete we would need to add something like 14 separate registers corresponding to the fields within the Mode register, and add dependencies on those to all the instructions that depend on the settings (lots). We would also need a pass to combine changes to separate fields into a single setreg wherever possible, which would probably be something similar to the pass we have now. This feels like a rather invasive set of changes. This approach would have the advantage that it would probably also resolve the concerns Matt raised. However, we chose not to adopt this approach as we considered the cost-benefit equation to be too heavy on the cost side.
>
> The approach we have implemented is a compromise that meets our current needs, is extendable for other mode settings should that become necessary, and isn't too invasive. It produces a minimal number of setreg instructions in almost all cases. Running the pass late avoids scheduling issues, but does possibly miss some minor optimization opportunities. However, given the rare occurrence of non-default modes the impact is very small.
>
> Do you think the benefits of the multi-register approach justify the effort required over the current approach?


It is not only the few interpolation instructions that need it. We need to implement OpenCL non-default rounding modes for arithmetic instructions. That will require the use of setregs.
In fact, I do not see a non-invasive or efficient way to implement OpenCL rounding modes without proper modeling of HWREG and its dependencies, because lowering of intrinsics must occur early.
That means that if a late approach is submitted, we will need to revert it and reimplement it anyway. That is, I believe this effort is perfectly justified.
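
For illustration only, here is a minimal sketch of the "combine changes to separate fields into a single setreg" step mentioned in the quoted comment. It is not code from this patch or from LLVM: the FieldWrite struct and mergeAdjacentWrites function are hypothetical names, and the sketch assumes that a single setreg can write one contiguous bit range of the Mode register and that pending writes do not overlap.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one pending write to a bit field of the
// Mode register. This is not an LLVM type.
struct FieldWrite {
  unsigned Offset; // index of the first bit of the field
  unsigned Width;  // number of bits in the field
  uint32_t Value;  // new field value, right-aligned
};

// Merge writes that cover adjacent bit ranges, so that each resulting entry
// could be emitted as one setreg. Assumes the input ranges do not overlap.
std::vector<FieldWrite> mergeAdjacentWrites(std::vector<FieldWrite> Writes) {
  std::sort(Writes.begin(), Writes.end(),
            [](const FieldWrite &A, const FieldWrite &B) {
              return A.Offset < B.Offset;
            });

  std::vector<FieldWrite> Merged;
  for (const FieldWrite &W : Writes) {
    assert(W.Width > 0 && W.Offset + W.Width <= 32 && "field out of range");
    // Keep only the bits that belong to the field.
    uint32_t Mask = W.Width == 32 ? ~0u : ((1u << W.Width) - 1);
    uint32_t Bits = W.Value & Mask;

    if (!Merged.empty() &&
        Merged.back().Offset + Merged.back().Width == W.Offset) {
      // This write starts exactly where the previous range ends: extend it.
      FieldWrite &Prev = Merged.back();
      Prev.Value |= Bits << Prev.Width;
      Prev.Width += W.Width;
    } else {
      Merged.push_back({W.Offset, W.Width, Bits});
    }
  }
  return Merged;
}
```

With per-field registers modeled, a pass along these lines would only have to decide which writes can share an instruction. Ordering against the instructions that read the mode would come from the register dependencies rather than from running late.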


Repository:
  rL LLVM

https://reviews.llvm.org/D50633




