[PATCH] D61787: [GlobalISel Legalizer] Improve artifact combiner

Thu Jul 4 06:52:05 PDT 2019

Petar.Avramovic added a comment.

About

> 
> 
>   %1 = G_MERGE_VALUES %0, ...
>   %2 = G_TRUNC %1

Potential problem about such combine is that G_TRUNC could than both perform combine, and be combined away, and one of this two has to be attempted first, and it is not quite clear when we have to wait for other instruction to be legalized.

There is already similar combine that we could possibly use:

  %1 = G_MERGE_VALUES %0, ...
  %2 = G_EXTRACT %1, 0

For example instead of narrowing scalar with G_TRUNC we could G_EXTRACT to narrowType instead.

> There is also another means of doing the right thing. If G_TRUNC correctly specified the min/max types with clampScalar() and a G_TRUNC narrowScalar were implemented as G_UNMERGE_VALUES/G_MERGE_VALUES then the MERGE/UNMERGE would pair up and use the existing combines. Similarly for G_SEXT. The snag there is the one I think you're trying to solve which is that the MERGE/UNMERGE have already left the artifact list and won't retry the combines again.

Oh yes. Let's go with the simplest case for start. I can't deal with e.g. `s64 = G_SEXT s32` or `s32 = G_TRUNC s64` without narrowScalar but then I am left with uncombined G_MERGE etc.Maybe this could be covered with additional Artifact combines?

> I think adding the missing combine(s) is my first preference at the moment but I ought to give it a bit more thought and weigh up the different options.

I am not so sure that combines alone could fix problems we have:
1  In order to be able to perform combine both parts have to be present (both auxiliary artifact and artifact as I named them in this patch). 
Just having the combine available is not of much use when one of the parts is not present. We don't know if we: 
have to wait for other part to appear while there is possibility or
legalize when we know nothing can produce the other part.
2  Some of potential new Artifacts combines are conditional, e.g.

  %1 = G_MERGE_VALUES %0, ...
  %2 = G_TRUNC %1

is only possible when %2 has same LLT as %0.
OK, we could also perform narrowScalar to LLT of %0 when %2 has different LLT first and then combine. 
Then it looks like having combine that does: narrowScalar + combine G_MERGE/G_UNMERGE. There is nothing wrong with this I guess, some combines could be even more complicated but are not really required since they can break down into sequence of already present simpler actions.

Combines we have atm are essentially all we need, using them at the right time in conjunction with narrowScalar or similar should be able to deal with everything. Here I tried to address the 'at the right time'.

> The reason there isn't an MIR test for it there is that it's already covered by the unit tests.

Unit tests are fine, however the instructions that are made by narrowScalar will not combine away because the artifacts that should perform combine were not in the artifact list. When I tried it with s128 G_SEXT_INREG it had uncombined artifacts that are legal, it fallbacks to sdag later if I remember correctly.

> We do a pass over the non-artifacts, then a pass over the artifacts and repeat that cycle until we're finished. There's no actual requirement to do distinct passes like this, it just made it easier to understand, implement, and be sure we were combining away all the problematic artifacts.

I don't agree with "There's no actual requirement"

Current code deals with artifact chained to an instruction because instructions always go first, but does not deal with artifact chained to an artifact.

The "be sure we were combining away all the problematic artifacts" implies a few requirements.
For example G_UNMERGE is most likely chained to some instruction (it could also be chained to an artifact but let's leave such complications for now).
If we want to combine this G_UNMERGE away, instruction has to be legalized first (it will create G_MERGE that G_UNMERGE is waiting for).

If we do artifact combine first most of the artifacts will leave Artifact list as legal instructions and won't attempt to combine again (G_UNMERGE is one of them). That is why we legalize instructions first.
For same reason we combine/legalize instruction/artifacts that USE %xy first before instruction/artifact that DEFs %xy.

Well in fact, some artifact combines will still be performed because e.g. G_TRUNC currently jumps to uses of its operand 0 and tries to perform combine from there but this is somewhat chaotic: legal instruction(uncombined artifact) that is not in artifact List perform combine, not to mention that that we might not be able to deal with it later if it leaves Legalizer despite the fact that it is legal. G_TRUNC jumping was probably added at some point in attempt fix the fact that some chained artifacts could leave the Artifact list (in this patch they have to wait in another list, and we don't ask them if they are legal until we are sure they can't be combined away).

I could make a new RFC thread with a few tests and expected results, to see how to deal with them. We could go with simple ones first, here is one:

  %2:_(s32) = G_ADD %0(s32), %1(s32)
  %3:_(s64) = G_SEXT %2(s32)
  %4:_(s64) = ...
  %5:_(s64) = G_ADD %3(s64), %4(s64)
  ...

for aarch64 just double the size of types. This patch with Narrow scalar G_SEXT would do the trick, but we should explore other possibilities.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D61787/new/

https://reviews.llvm.org/D61787