[PATCH] D73446: AMDGPU/GlobalISel: Legalize a16 images

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 30 06:59:00 PST 2020


arsenm added a comment.

In D73446#1848869 <https://reviews.llvm.org/D73446#1848869>, @nhaehnle wrote:

> In D73446#1846711 <https://reviews.llvm.org/D73446#1846711>, @arsenm wrote:
>
> > In D73446#1846283 <https://reviews.llvm.org/D73446#1846283>, @nhaehnle wrote:
> >
> > > In D73446#1842457 <https://reviews.llvm.org/D73446#1842457>, @arsenm wrote:
> > >
> > > > In D73446#1842443 <https://reviews.llvm.org/D73446#1842443>, @arsenm wrote:
> > > >
> > > > > In D73446#1842422 <https://reviews.llvm.org/D73446#1842422>, @nhaehnle wrote:
> > > > >
> > > > > > Why is this conceptually a legalization rather than part of the instruction selection?
> > > > >
> > > > >
> > > > > The register layout packing/unpacking code should be exposed to the legalizer and combines. Doing it later means it could possibly be in a waterfall loop, and won't benefit from the combines on the pack/unpack code.
> > > >
> > > >
> > > > More abstractly, selection should operate on the types expected for the final instruction. It's the legalizer's job to get registers that are the right type/size
> > >
> > >
> > > The part that makes me uncomfortable here and I couldn't quite put a finger on initially is that it implies a change in the semantics of the intrinsic. What this and related changes are doing is implicitly changing the design such that the intrinsics mean something different before vs. after legalization. That is not what legalization is usually supposed to do.
> >
> >
> > Ideally we should have separate intermediate image opcodes. However, I have no interest in creating another giant set of image opcodes that will need more searchable tables. I don't think it's really a semantic change, and it should be possible to go backwards from the legalized intrinsic to what it was originally. What I am worried about is what happens if you run the legalizer twice, which should be OK but I'm ignoring this problem for now until I have the full selection working. Two options I've considered are re-using some of the extra bits in one of the immediate arguments to encode how it's been changed, or to use a variadic wrapper instruction which will just capture the intrinsic ID and how it was legalized.
>
>
> Big "no" on another set of image opcodes from my side as well.
>
> >> I see your point about combiner passes -- so could we perhaps just select the image instructions early instead? It's not like we have very interesting things happening for image instructions in the ISel patterns anyway, and one of the benefits of the GlobalISel infrastructure is that it's supposed to be flexible enough for stuff like that...
> > 
> > I think the only reasonable place outside of selection to do this would be in RegBankSelect/applyMappingImpl, which I did consider before going with the legalizer. We still need to RegBankSelect the intrinsics, and possibly move them into a waterfall loop, and I think introducing real register class constraints earlier is generally undesirable. Eventually, we should run some combiner pass after which should take care of packing code. We could theoretically trick RegBankSelect into handling the selected instructions, but that would also be pretty disgusting. Another small issue with this is it sort of implies making the full set of legalization artifacts legal for <3 x f16> cases. I started working towards this in D72639 <https://reviews.llvm.org/D72639>, but I don't really like it and would prefer if these were all legalized to <4 x s16> during legalization. I think we could use the artifact combiner to eliminate the illegal register types in RegBankSelect, but I don't think that should be mandatory. We claim some of these these are legal today, but that's mostly a hack to deal with missing features in the legalizer.
>
> Part of the problem here is that the final machine instructions have register class constraints in the first place. I've been wondering for some time now what register classes even buy us in the end. It seems to me that they're largely useless, and almost everything we need from a conceptual point of view is contained in the register banks. The few complications around M0, SCC, VCC, could be dealt with explicitly since we largely shouldn't allocate them using a generic approach anyway.


The register classes aren't really useless, and we do have a variety of more exotic operand constraints to deal with. We need them to represent cases like operands that don't support M0/exec/whatever, and cases like SReg_96 only supporting one 64-bit subregister. VCC is also a completely normal, allocatable register. The class constraints don't even necessarily matter when allocating, but when folding copies between classes.

> Is there a way to just relax those constraints for a select subset of opcodes, like the image opcodes?

I don't know what this really would mean


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73446/new/

https://reviews.llvm.org/D73446





More information about the llvm-commits mailing list