[llvm-dev] GlobalISel legalization guarantees

Tim Northover via llvm-dev llvm-dev at lists.llvm.org
Wed Jul 19 14:08:32 PDT 2017


Hi all,

I've been thinking about what guarantees the legalization pass needs
to make in GlobalISel and how it should do it. Turns out it's really
quite difficult so I thought I'd send a message out to the list with
the current status and my thoughts for the future.

It got really quite long unfortunately, so the important bit is
probably the second section "High level solutions" on scary changes
that I can't see a way around.

Does anyone else either see issues I've missed with guaranteeing
legalization, or see simpler solutions (please please please)?

Cheers.

Tim.

==========
Background
==========

By some means we need the legalizer to be able to completely remove all illegal
types (a nebulous concept, but it should be obvious that you don't want to try
RegBankSelect or ISel on a s1234). However, each stepwise change of the
legalizer has the input and output VRegs of an instruction as its interface so
it cannot remove these types.

So, the bulk of legalization for larger types is expressed in terms of
G_MERGE_VALUES and G_UNMERGE_VALUES instructions, with the intent that a final
combining pass will be run after the main legalization to remove matching pairs
and leave the MachineFunction legal.

However, it does an incomplete job right now.

In ~630 cases (in the test-suite at -O0) we have a G_UNMERGE_VALUES but it can't
be matched up with G_MERGE_VALUES and removed:

600 G_UNMERGE hits a PHI
30  G_UNMERGE hits an AArch64 intrinsic (e.g. @llvm.aarch64.ld2)

On the other hand about 8000 G_MERGE_VALUES remain after the combining that
don't match a G_UNMERGE_VALUES:

5300 used by a G_STORE
3090 used by a COPY (to physregs or dead?).
60 used by a G_EXTRACT
15 are dead

There are also more or less theoretical issues that don't arise in the
test-suite but need to be resolved. In particular even when a G_MERGE_VALUES is
consumed by a G_UNMERGE_VALUES the input types of the merge may not be the same
as the outputs of the unmerge. The only way to resolve this conflict is by
inserting explicit extract/insert instructions which may or may not themselves
be legal (at least in principle).

====================
High level solutions
====================

Unless anyone comes up with more bright ideas, I think we can see the following
complications before we get to the state where the legalizer has a hope of
replacing SDAG with a straight face:

1. G_MERGE_VALUES and G_UNMERGE_VALUES will not always be eliminated and must be
   mappable+selectable to complex multi-instruction, multi-register
   copy/inserts.
2. We must have a complex enough combiner in the legalizer pass alone to perform
   multi-step legalization intermixed with combines in the hope that we reach a
   fixed point.

Below is a bit of expanded detail on the issues found by the test-suite
above. I'm not quite sure how coherent it is, but it made sense to me so I can
probably try to reword if anything is unclear.

=========================================
Expanded description of test-suite issues
=========================================

G_STORE and G_EXTRACT
---------------------

These are instances of a merge feeding legal instructions. I believe this has to
be permitted but it means that selection is going to contend with a wide range
of possible different merge instructions. At best all legal input and output
register classes must be handled.

On AArch64 the likely list is:

  + "mov vD.2d[0], dN; mov vD.2d[1], dM" (and s variants with 4 insts).
  + "mov vD.4s[0], wN, ..." (and d variants).
  + "ubfi xD, xN, #0, #32; ubfi xD, xM, #32, #32"
  + "COPY; COPY; COPY" or multiples of the above when stitching together a QQQ.

Machine intrinsics
------------------

The versions of this problem I've seen come from code like this::

  {<2 x i32>, <2 x i32>, <2 x i32>} %val = call {...} @llvm.aarch64.ld3(%ptr)
  %vec0 = extractelement {...} %val, 1

which becomes::

  s192 = G_INTRINSIC_W_SIDE_EFFECTS ...
  s64, s64, s64 = G_UNMERGE_VALUES s192

Because intrinsics have to have sane types (they're not really legalized
currently, even in SDag) this is very much like the issue with G_STORE and
G_EXTRACT above. We probably have to support RegBankSelect and ISel on this kind
of G_UNMERGE_VALUES, which becomes a tractable but complex sequence of copies.

PHIs
----

For example:

::
   s128 = PHI [%v0, BB#0], [%v1, BB#1]
   s64, s64 = G_UNMERGE_VALUES s128

I believe we have to split the PHI into sizes corresponding to the
G_UNMERGE_VALUES. If these correspond to the sizes of the (expected) merges on
the other side that's not too bad::

  s64 = PHI [%v0.0, BB#0], [%v1.0, BB#1]
  s64 = PHI [%v0.1, BB#0], [%v1.1, BB#1]

On the other hand we also have the possibility of mismatching types here, with
the added fun that PHI instructions have to be the first in their block.

I believe we'll have to insert messy extract/insert code into a separate,
fallthrough block just before this one. The main question is whether it should
be done as part of the main legalization loop or in the post-processing pass.

Since we have to deal with illegal extracts & inserts anyway, delaying PHIs
until everything else is decided seems reasonable to me.

Mismatching types in paired merge/unmerge
-----------------------------------------

Friendly example::

  s384 = G_MERGE_VALUES s192, s192
  s128, s128, s128 = G_UNMERGE_VALUES s384

Both of these instructions have to go (s384 is likely horribly illegal), but
they can't simply be combined away. The first thing to try is converting to
extracts & inserts::

  s128 = G_EXTRACT s192, 0

  s64 = G_EXTRACT s192, 128
  s64 = G_EXTRACT s192, 0
  s128 = G_IMPLICIT_DEF
  s128 = G_INSERT s128, s64, 0
  s128 = G_INSERT s128, s64, 64

  s128 = G_EXTRACT s192, 64

The good side of this is that I believe it is capable of eliminating the bad
s384 type. The bad side is that we have no reason to expect the actual extract
and insert operations to be legal. Each one potentially corresponds to some
reasonably complex shifting and masking across multiple registers.

To get around this we could either demand that extract and insert **are** legal
whenever a type has an available register class, or attempt to legalize them in
turn (leading to more merges and a fixed-point at the end if we're lucky).

Used by COPY
------------

Too many to be sure what's going on, but a sampling suggests most are either
ending up in PhysRegs (fine) or dead.

A merge being used by dead COPYs could be a problem.


Dead instructions
-----------------

Not a concern. Delete them if we want, or maybe ignore them.


More information about the llvm-dev mailing list