[RFC, PowerPC] Initial fast-isel support for PPC64

Fri Jul 12 14:36:42 PDT 2013

Hi Bill,

Wow. Very large patch. I've done a glance through it and have a few
odd cleanup comments, but it's also pretty obvious via the FIXMEs that
you know what needs to be cleaned up as well. It'd be nice if you
could split this up a bit more, I've got some questions below.

Testing question:

Have you tested with the test-suite that is in projects/test-suite?
It's really good at finding bugs.

Few nits:

a) FORNOW->FIXME

b) This is worrisome:

+  // For fast-isel, a COPY_TO_REGCLASS may survive this long (necessary
+  // to make machine instruction verification happy wrt register classes
+  // for whatever reason).  FORNOW: Hack to avoid trying to print it
+  // as it should have no effect.  FIXME: What is the real solution?
+  if (MI->getOpcode() == TargetOpcode::COPY_TO_REGCLASS)
+    return;

and, of course, the odd register class issues with extensions.

c) Indenting

+  typedef struct Address {
+    enum {
+      RegBase,
+      FrameIndexBase

d) Block comments on top of functions.

e) No need for these {, I realize you copied them from ARM and I'll fix them.

+    case Instruction::BitCast: {

f) The noreg0 register class stuff is a bit icky, is there some way we
can fix this?

g) Relatedly, the GR8 stuff for 64-bit registers is really confusing,
meta thing for the port of maybe using 8, 16, 32, 64 for the
associated things.

h) Document why you've only added calling conventions for fast isel.

i) Go ahead and remove this until you implement it.

+bool PPCFastISel::SelectIntrinsicCall(const IntrinsicInst &I) {
+  // FIXME: Not yet implemented.
+  return false;
+}
+
+bool PPCFastISel::SelectSelect(const Instruction *I) {
+  // FIXME: Not yet implemented.
+  return false;
+}

j) May as well separate out materializing a 64-bit integer as well :)

Thoughts?

-eric

On Mon, Jun 24, 2013 at 9:19 AM, Bill Schmidt
<wschmidt at linux.vnet.ibm.com> wrote:
> Here's my proposed initial implementation for fast instruction selection
> on PowerPC.  This implementation is strictly for PPC64 ELF (no 32-bit
> support, no Darwin support).
>
> I used the ARM fast-isel support as a template for the PPC64 support.  I
> found the existing code there very helpful, so kudos to Eric and others
> for their work.
>
> Although there's a lot of stuff here, there's a fair amount of work that
> could still be done to catch more cases.  There's some TBD commentary at
> the top of PPCFastISel.cpp indicating the major things that could be
> added.  I'm not sure that I'll have time to work on those issues in the
> near future as I am being pulled in several directions at once right
> now, but hopefully I or someone else can get to them eventually.
>
> Register classes cause some difficulty with the PowerPC implementation.
> There are two main issues:
>
>  - For 64-bit PowerPC, the GPRC register class consists of subregisters
> of the G8RC register class.  32-bit operations produce the GPRC
> registers, and 64-bit operations produce the G8RC registers.  However,
> in reality the registers are 64 bits in length, so that a 32-bit
> sign-extending load will propagate the sign into the upper 32 bits, for
> example.  The DAG selection code is able to unify all this, but when
> generating MIs directly this can create awkwardness.  In order to pass
> machine verification, I created "_32_64" versions of several of the
> 32-bit instructions, which consume GPRC registers and produce G8RC
> registers.
>
>  - To be able to assign X0 (64-bit) or R0 (32-bit) where it's legal to
> do so, each of G8RC and GPRC has a subclass excluding that register.
> These are intersection subclasses named G8RC_and_G8RC_NOX0 and
> GPRC_and_GPRC_NOR0, respectively.  These are needed because register 0
> means the immediate value 0 in several instructions:  as the base
> register of loads and stores; as the first operand in an add-immediate;
> and in the integer select instruction (not yet implemented in
> fast-isel).  Unfortunately the automatically generated portions of
> fast-isel processing use GPRC and G8RC for 32-bit and 64-bit operands
> regardless of this (first register class defined for the value type).
> This requires some extra overrides of the emit routines.
>    Related to this, we cannot in general assign result registers for an
> instruction without thinking about the register-zero question.  Where
> possible, I've chosen to use the currently assigned register for an
> instruction to determine the register class for that instruction's
> result.  So if an instruction has anticipated uses in the same block, we
> know whether or not it can be assigned to R0; if not, we have to
> conservatively assume that it can't be.
>    It might be possible to just override everything so we never assign
> register zero to anything for fast-isel code.  But I'm not sure this
> would work well when we hit something we can't handle and punt to the
> DAG selector.  I'm interested in thoughts from the more knowledgable on
> this.
>
> There are two things I've done in this patch that I consider to be
> bloody hacks, and I am hopeful someone can suggest better alternatives:
>
>  - Calling convention hack:  With this patch, PPCGenCallingConv.inc is
> now included in two places:   PPCISelLowering.cpp and PPCFastISel.cpp.
> Because we don't use the 32-bit support, some functions defined in the
> generated code are unused in PPCFastISel.cpp, causing build failures
> with -Werror.  I also found it necessary to add some fast-isel specific
> calling convention descriptions, which are not used in
> PPCISelLowering.cpp, resulting in the same problem there.  For now, I've
> added a dummy method in each file that creates artificial uses of the
> unused functions to avoid the warnings.  There must be a better way...
>
>  - In PPCFastISel::SelectFPToI(), I have another machine verification
> issue where I need to convert an f32 (F4RC) to an f64 (F8RC).   This is
> a no-op on PowerPC since both values occupy a 64-bit register, with f32
> guaranteeing fewer bits of accuracy.  I looked at the debug output with
> the DAG selector to see what I needed, and it showed a COPY from f32 to
> f64.  However, when I tried to use that, something in the downstream
> machinery quietly changed my copy into an f32->f32 copy, and I still
> failed machine verification.  I circumvented this by using
> COPY_TO_REGCLASS, which works; however, I believe that COPY_TO_REGCLASS
> isn't expected to still exist once we've lowered to MIs.  At least,
> PPCInstPrinter::printInst() was surprised and blew up on that opcode.  I
> added a hack to skip a COPY_TO_REGCLASS during asm-printing since it's a
> no-op, but there really must be a better way to handle this.
>
> OK, enough prologue.  The patch is attached, and I would really
> appreciate some good review before I commit it.  I have a lot of tests
> that all pass with -verify-machineinstrs, but I am sure there are things
> that I have missed, so the more eyes the better.
>
> When committing, I can split out the changes to PPCInstPrinter.cpp and
> PPCInstr{Info,64Bit}.td as separate patches, but everything else is
> pretty strongly tied together.  If you have suggestions for committing
> this in smaller pieces, I welcome those as well.
>
> For those of you who got this far:  thanks for reading.  Definitely
> qualifies as a tl;dr.  But hey, it's nothing compared with the patch. ;)
>
> Thanks,
> Bill