[llvm-commits] [llvm] r47658 - in /llvm/trunk: lib/Target/X86/README-X86-64.txt lib/Target/X86/X86Instr64bit.td test/CodeGen/X86/x86-64-and-mask.ll

Sat Mar 8 14:18:36 PST 2008

On Mar 8, 2008, at 1:55 PM, Christopher Lamb wrote:

Hey Christopher!

> This was originally part of my changes to use sub/super registers on  
> x86, but got nixed by Evan. See this thread: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070730/052377.html

Ok, I didn't remember that.

> The point that Evan made is that implicit zeroing of the upper part  
> of the superregister is target specific to x86-64, thus target  
> independent subreg instructions don't properly capture this behavior.

I agree with that.  Subregs (by themselves) really talk about
reading/writing to a sub/superset of another register.  The strange
thing here is that the operation actually changes all 64 bits, not
just the 32 bits being targeted.
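
To make the distinction concrete, here is a minimal sketch that models a 64-bit register as a Python integer. The function names are hypothetical and exist only for illustration; they are not LLVM APIs. The first function models a plain subreg insert (upper half preserved), the second models what a 32-bit register write does on x86-64 (upper half zeroed).

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def insert_subreg32(super64: int, value32: int) -> int:
    """Generic subreg insert: replace the low 32 bits, keep the upper 32."""
    return (super64 & ~MASK32 & MASK64) | (value32 & MASK32)


def x86_64_write32(super64: int, value32: int) -> int:
    """x86-64 32-bit register write: the old upper 32 bits are discarded."""
    return value32 & MASK32


old = 0x1122334455667788
print(hex(insert_subreg32(old, 0xDEADBEEF)))  # 0x11223344deadbeef
print(hex(x86_64_write32(old, 0xDEADBEEF)))   # 0xdeadbeef
```

The second result does not depend on the old register contents at all, which is why the operation looks like a full 64-bit definition rather than a partial write.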

> My argument was/is for having a single input form of insert_subreg  
> which is explicitly there to capture target dependent semantics. The  
> target independent machinery is then free to operate on that single  
> input insert_subreg the same way across all platforms (it could be  
> insert into undef, insert into zero, insert into all ones), but the  
> legality of its usage is up to the specific code generator.

I'm not sure exactly what you mean, but generally it's good for nodes  
to have well defined semantics independent of the target.  Of course,  
this could be done by saying that insert_subreg takes an immediate  
value to indicate which form it is, and only some forms are valid.   
I'm not sure how it simplifies this though.
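
As a rough sketch of the "immediate selects the form" idea, under the assumption that each target publishes which forms it can select: the `Fill` enum, the `LEGAL_FORMS` table and `insert_subreg_single` are all hypothetical names, not anything in the tree. The undef form is left out because its upper bits have no defined value to model.

```python
from enum import Enum

MASK32 = 0xFFFFFFFF


class Fill(Enum):
    """Hypothetical immediate operand: what the upper 32 bits become."""
    ZERO = 0
    ONES = 1


# Assumed legality table: x86-64 gets the zero form for free from a
# 32-bit register write; no other entries are claimed here.
LEGAL_FORMS = {"x86-64": {Fill.ZERO}}


def insert_subreg_single(value32: int, fill: Fill, target: str) -> int:
    """Single-input insert_subreg with target-independent semantics."""
    if fill not in LEGAL_FORMS.get(target, set()):
        raise ValueError(f"{fill.name} form is not legal on {target}")
    upper = 0 if fill is Fill.ZERO else MASK32
    return (upper << 32) | (value32 & MASK32)


print(hex(insert_subreg_single(0xDEADBEEF, Fill.ZERO, "x86-64")))
```

The node's meaning is the same everywhere; only the legality check is per-target.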

> Today this single input form of insert_subreg exists, but is unused.  
> Would it be better if it were a separate node, rather than being  
> treated differently based on the number of operands?

I'm really not a guru on this sort of stuff.  I am somewhat  
dissatisfied with both ppc64 and x86-64, which have to duplicate a  
number of 32-bit register operation forms when operating on the "low  
part of a 64-bit register".  However, I'm not sure I know a better way  
to model this.

Can you give a concrete example of how this would play out, using  
'mov' below as an example?  Basically in svn today we now have 2  
copies of 32-bit mov, which codegen to the same instruction, but are  
matched at isel time in two different ways.  How would your solution  
change this?

Right now we have:

def MOV32rr : I<0x89, MRMDestReg, (outs GR32:$dst), (ins GR32:$src),
                 "mov{l}\t{$src, $dst|$dst, $src}", []>;
and:

def PsAND64rrFFFFFFFF
   : I<0x89, MRMDestReg, (outs GR64:$dst), (ins GR64:$src),
       "mov{l}\t{${src:subreg32}, ${dst:subreg32}|${dst:subreg32}, ${src:subreg32}}",
       [(set GR64:$dst, (and GR64:$src, i64immFFFFFFFF))]>;

One very simple and nice thing we could do is replace the duplicated  
instruction with a Pat pattern.  This would mean that there is only  
one instruction and the magic just happens in the isel.  This would  
give us something like this:

def : Pat<(and GR64:$src, i64immFFFFFFFF),
           (x86_64_bit_part_of_32_bit
              (MOV32rr (subreg GR64:$src, x86_32bit_part_of_64bit)))>;

This puts more pressure on the coalescer to coalesce away the copies,  
but seems like an overall better solution.  Is this the sort of thing  
you mean?
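
For reference, the three steps in that Pat can be modeled on plain integers to check that they compose to the 64-bit AND with 0xFFFFFFFF. This is only a sketch: the helper names are hypothetical stand-ins for the subreg read, MOV32rr, and the reinterpretation of the 32-bit result as a 64-bit value with a zeroed upper half.

```python
MASK32 = 0xFFFFFFFF


def extract_low32(reg64: int) -> int:
    """Subreg read: take the low 32 bits of a 64-bit register."""
    return reg64 & MASK32


def mov32rr(src32: int) -> int:
    """Plain 32-bit register copy."""
    return src32 & MASK32


def as_64bit_zero_upper(val32: int) -> int:
    """View the 32-bit result as 64 bits; x86-64 zeroed the upper half."""
    return val32 & MASK32


def and_mask_via_mov32(src64: int) -> int:
    return as_64bit_zero_upper(mov32rr(extract_low32(src64)))


for x in (0, 0x1122334455667788, 0xFFFFFFFFFFFFFFFF):
    assert and_mask_via_mov32(x) == x & MASK32
```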

-Chris
