[llvm-commits] Fwd: [llvm] r48223 - in /llvm/trunk/lib: CodeGen/LowerSubregs.cpp CodeGen/SelectionDAG/ScheduleDAG.cpp Target/Target.td Target/X86/X86ISelDAGToDAG.cpp Target/X86/X86InstrInfo.h Target/X86/X86InstrInfo.td Target/X86/X86RegisterInfo.h Target/X86/X86RegisterInfo.td

Fri Mar 14 12:16:48 PDT 2008

Ok, how 'bout this:

$dst = insert_subreg $super, $sub, $idx with constraint "$dst = $super"

$dst = combine_subreg $undef_or_imm, $sub, $idx with constraint "$sub  
= subreg($dst, $idx)"

for 'combine_subreg' if the first input operand is a register, then  
it is assumed to contain an undef value, otherwise the first operand  
is an immediate value that represents the implict value being  
inserted into.

This should allow us to still model insert into undef as:

vr1025 = undef
vr1026 = combine_subreg vr1025, vr1024, x86_subreg32bit

As soon as undef live ranges are assigned a random register the fact  
that 1025 and 1026 overlap will not be a problem.

And then implicit zext on x86-64 would be:

vr1025 = combine_subreg 0, vr1024, x86_subreg32bit

1. Requires the new constraint I mentioned.

2. The two-address constraints should seriously simplify  
LowerSubregs, as they won't have to worry about the 'revCopyOrder'  
they do today.

3. The two-address-subreg constraint will allow LowerSubregs to  
coalesce away combine_subreg without needing a target hook, I think.

On Mar 14, 2008, at 10:02 AM, Evan Cheng wrote:

> Think about insert_subreg in the literal sense. It's modifying part  
> of a register. That is, the input and output super register must be  
> the same. Insert into zero would be modeled like this.
>
> vr1024 = mov 0
> vr1024 = insert_subreg vr1024, vr1025, x86_subreg32bit
>
> Ideally we model insert into undef the same way:
>
> vr1024 = insert_subreg undef, vr1025, x86_subreg32bit
>
> However, it's unclear what issues we would run into if  
> insert_subreg is a two address instruction. That's why I suggest  
> making into into a separate insert_subreg_undef:
>
> vr1024 = insert_subreg_undef vr1025, x86_subreg32bit.
>
>
> On Mar 14, 2008, at 1:49 AM, Christopher Lamb wrote:
>
>> 1. insert_subreg into zero, the implicit zext form for x86-64 can  
>> be coalesced too, just more carefully.
>>
>> rax = andq rcx, rax
>> rdx = insert_subreg 0, eax, x86_subreg32bit
>>
>> could become
>>
>> edx = andl ecx, eax
>
> Coalescing away zero extension is tricky:
>
> (zext (and r1, r2))
> =>
> vr1025 = andl vr1025, vr1024
> vr1026 = mov 0
> vr1026 = insert_subreg vr1026, vr1025, x86_subreg32bit
>
> There are two issues with coalescing away the insert_subreg. 1)  
> This insert_subreg isn't like a copy and it's only a *free*  
> instruction on certain targets like x86-64. We would need a target  
> specific hook to tell us it can be coalesced away. It seems like an  
> abuse. 2) The live intervals of vr1025 and vr1026 clearly overlap  
> which prevents

@1 insert/extract_subreg are not introduced by target independent  
code, they can only be introduced by the backend for a specific target.

@2 This is exactly the problem I've been wrestling with from the  
beginning with insert_subreg. You don't actually want to allocate  
registers for implict values.

> So the two address form of insert_subreg clearly cannot be  
> coalesced away.

I assume you mean the two-operand version, rather than the three- 
operand two-address version.

It can't be coalesced away, but it could be removed by LowerSubregs.  
It would require modifying the instruction producing the subreg value  
so that the target specific semantics are still satisfied, though no  
modifications would be necessary for inserting into undef.

Ideally we would have a constaint like two-address for this  
instruction, which says allocate one register in the instruction as a  
subregister of the other:

vr1025 = andl vr1025, vr1024
vr1026 = insert_subreg_undef_or_imm 0, vr1025, x86_subreg32bit

becomes

eax = andl ...
rax = insert_subreg_undef_or_imm 0, eax, x86_subreg32bit

Here LowerSubregs could be taught to not insert the copy. The two  
address logic would allocate a new subregister for the  
insert_subreg... if the andl value was needed more than once, and  
then the insert subreg can likely be removed still (though leaving  
the two-address copy).

> Now let's assume instead of insert_subreg_undef, there is a  
> insert_subreg_undef_or_imm which allows an undef or immediate as  
> the first operand:
>
> (zext (and r1, r2))
> =>
> vr1025 = andl vr1025, vr1024
> vr1026 = insert_subreg_undef_or_imm 0, vr1025, x86_subreg32bit
>
> This *solves* the second problem. I'm still of two minds on whether  
> allowing coalescing of this is a gross abuse.

If you support the "$reg = subreg($dst,  x86_subreg32bit)" constraint  
then I think LowerSubregs can do the work based on a target hook.

>> Is this optimization important? Does making an explicit  
>> insert_subreg_undef prevent this?
>>
>> 2. Are you saying that the normal insert_subreg instruction should  
>> have the two-address constraint? Is this simply to simplify  
>> coalescing support? How would insert into zero work in that case?
>
> Yes, this is the *literal* form of insert_subreg. The output and  
> input operand 1 should be the same. Except that's not an  
> optimization rather than a two-address property enforced by the  
> register allocator.
>
>>
>> I think the way subreg's were modeled in my patch is very much in  
>> the spirit of LLVM's design philosophy, but I don't know if  
>> modeling it that way causes much pain in the coalescing support.  
>> Is this why you don't think it's modeled right?
>
> We are getting very close. :-) One more iteration we should be  
> getting this right.
>
> Thanks,
>
> Evan
>
>> --
>> Chris
>>
>> On Mar 13, 2008, at 8:12 PM, Evan Cheng wrote:
>>
>>> Actually, I think this is wrong. I've been thinking about the  
>>> insert_subreg stuff and I think the way it's modeled is wrong.
>>>
>>> It should be split into 2 types:
>>>
>>> 1. insert_subreg_undef - this is the two operand form. It's kind  
>>> of like anyext but coalescable.
>>> 2. insert_subreg - this is the three operand form. It should be a  
>>> two-address instruction.
>>>
>>> What do you think?
>>>
>>> Evan
>>>
>>> On Mar 13, 2008, at 6:48 PM, Evan Cheng wrote:
>>>
>>>> Ah, this independent of insert_subreg coalescing support. I've  
>>>> just added an undef MachineOperandType. If you use that instead  
>>>> of a target specific undef instruction, that should fix this  
>>>> problem.
>>>>
>>>> Then I'll add coalescer support which should eliminate 8-bit  
>>>> move as well.
>>>>
>>>> Thanks,
>>>>
>>>> Evan
>>>>
>>>> On Mar 13, 2008, at 1:58 AM, Christopher Lamb wrote:
>>>>
>>>>>
>>>>>
>>>>> Begin forwarded message:
>>>>>
>>>>>> From: Christopher Lamb <christopher.lamb at gmail.com>
>>>>>> Date: March 13, 2008 1:58:00 AM PDT
>>>>>> To: Evan Cheng <evan.cheng at apple.com>
>>>>>> Subject: Re: [llvm-commits] [llvm] r48223 - in /llvm/trunk/ 
>>>>>> lib: CodeGen/LowerSubregs.cpp	CodeGen/SelectionDAG/ 
>>>>>> ScheduleDAG.cpp	Target/Target.td Target/X86/ 
>>>>>> X86ISelDAGToDAG.cpp	Target/X86/X86InstrInfo.h Target/X86/ 
>>>>>> X86InstrInfo.td	Target/X86/X86RegisterInfo.h Target/X86/ 
>>>>>> X86RegisterInfo.td
>>>>>>
>>>>>> Patch attached. It's running through the nightly test tonight.
>>>>>>
>>>>>>
>>>>> <ins_subr.patch>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mar 13, 2008, at 1:06 AM, Evan Cheng wrote:
>>>>>>
>>>>>>> This is with your upcoming patch? Can you send it to me so I  
>>>>>>> can look into coalescing insert_subreg?
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Evan
>>>>>>>
>>>>>>> On Mar 13, 2008, at 12:36 AM, Christopher Lamb wrote:
>>>>>>>
>>>>>>>> Until the coalescer optimizes insert_subreg, there will be  
>>>>>>>> some pessimization in the code generated. For example this  
>>>>>>>> from CodeGen/X86/2007-09-05-InvalidAsm.ll
>>>>>>>>
>>>>>>>>         #IMPLICIT_DEF EAX
>>>>>>>>         mov     ECX, EAX
>>>>>>>>         mov     CX, SI
>>>>>>>>         lea     ECX, QWORD PTR [8*RCX]
>>>>>>>>
>>>>>>>> Prior to explicitly modeling the undef value we were getting:
>>>>>>>>
>>>>>>>>         mov     CX, SI
>>>>>>>>         lea     ECX, QWORD PTR [8*RCX]
>>>>>>>>
>>>>>>>> --
>>>>>>>> Chris
>>>>>>>>
>>>>>>>> On Mar 12, 2008, at 11:39 PM, Evan Cheng wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mar 12, 2008, at 11:33 PM, Christopher Lamb wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mar 12, 2008, at 11:23 PM, Evan Cheng wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mar 12, 2008, at 9:56 PM, Christopher Lamb wrote:
>>>>>>>>>>>
>>>>>>>>>>>> This is hypothetical, but what happens if some target  
>>>>>>>>>>>> supports sign extension on insert?
>>>>>>>>>>>
>>>>>>>>>>> Then I would said insert_subreg isn't a good way to model  
>>>>>>>>>>> it. It's more than just inserting a value in that case.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I agree allowing an undef node would be nice.
>>>>>>>>>>>
>>>>>>>>>>> undef and immediate (in additional to a register) should  
>>>>>>>>>>> cover all the cases we care about.
>>>>>>>>>>
>>>>>>>>>> This sounds good. I'm working up a patch now. =)
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>> Evan
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Chris
>>>>>>>>>>
>>>>>>>>>>>> On Mar 12, 2008, at 12:12 AM, Evan Cheng wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Modified: llvm/trunk/lib/Target/X86/X86InstrInfo.h
>>>>>>>>>>>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/ 
>>>>>>>>>>>>>> lib/Target/X86/X86InstrInfo.h? 
>>>>>>>>>>>>>> rev=48223&r1=48222&r2=48223&view=diff
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> ========================================================= 
>>>>>>>>>>>>>> =============
>>>>>>>>>>>>>> --- llvm/trunk/lib/Target/X86/X86InstrInfo.h (original)
>>>>>>>>>>>>>> +++ llvm/trunk/lib/Target/X86/X86InstrInfo.h Tue Mar  
>>>>>>>>>>>>>> 11 05:09:17 2008
>>>>>>>>>>>>>> @@ -46,6 +46,14 @@
>>>>>>>>>>>>>>     COND_INVALID
>>>>>>>>>>>>>>   };
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> +  // X86 specific implict values used for subregister  
>>>>>>>>>>>>>> inserts.
>>>>>>>>>>>>>> +  // This can be used to model the fact that x86-64  
>>>>>>>>>>>>>> by default
>>>>>>>>>>>>>> +  // inserts 32-bit values into 64-bit registers  
>>>>>>>>>>>>>> implicitly
>>>>>>>>>>>>>> containing zeros.
>>>>>>>>>>>>>> +  enum ImplicitVal {
>>>>>>>>>>>>>> +    IMPL_VAL_UNDEF = 0,
>>>>>>>>>>>>>> +    IMPL_VAL_ZERO  = 1
>>>>>>>>>>>>>> +  };
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>
>>>>>>>>>>>>> Rather than some magic target specific implicit value.  
>>>>>>>>>>>>> Why not just
>>>>>>>>>>>>> allow ISD::UNDEF? That is, allow the superreg operand  
>>>>>>>>>>>>> of INSERT_SUBREG
>>>>>>>>>>>>> as either a register, an immediate, or a ISD::UNDEF node?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Evan
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   // Turn condition code into conditional branch opcode.
>>>>>>>>>>>>>>   unsigned GetCondBranchFromCond(CondCode CC);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Modified: llvm/trunk/lib/Target/X86/X86InstrInfo.td
>>>>>>>>>>>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/ 
>>>>>>>>>>>>>> lib/Target/X86/X86InstrInfo.td? 
>>>>>>>>>>>>>> rev=48223&r1=48222&r2=48223&view=diff
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> ========================================================= 
>>>>>>>>>>>>>> =============
>>>>>>>>>>>>>> --- llvm/trunk/lib/Target/X86/X86InstrInfo.td (original)
>>>>>>>>>>>>>> +++ llvm/trunk/lib/Target/X86/X86InstrInfo.td Tue Mar  
>>>>>>>>>>>>>> 11 05:09:17 2008
>>>>>>>>>>>>>> @@ -161,6 +161,10 @@
>>>>>>>>>>>>>> // Branch targets have OtherVT type.
>>>>>>>>>>>>>> def brtarget : Operand<OtherVT>;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> +// These should match the enum X86::ImplicitVal
>>>>>>>>>>>>>> +def x86_impl_val_undef : PatLeaf<(i32 0)>;
>>>>>>>>>>>>>> +def x86_impl_val_zero  : PatLeaf<(i32 1)>;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> //
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> --------------------------------------------------------- 
>>>>>>>>>>>>>> -------------=
>>>>>>>>>>>>>> ==//
>>>>>>>>>>>>>> // X86 Complex Pattern Definitions.
>>>>>>>>>>>>>> //
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Modified: llvm/trunk/lib/Target/X86/X86RegisterInfo.h
>>>>>>>>>>>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/ 
>>>>>>>>>>>>>> lib/Target/X86/X86RegisterInfo.h? 
>>>>>>>>>>>>>> rev=48223&r1=48222&r2=48223&view=diff
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> ========================================================= 
>>>>>>>>>>>>>> =============
>>>>>>>>>>>>>> --- llvm/trunk/lib/Target/X86/X86RegisterInfo.h  
>>>>>>>>>>>>>> (original)
>>>>>>>>>>>>>> +++ llvm/trunk/lib/Target/X86/X86RegisterInfo.h Tue  
>>>>>>>>>>>>>> Mar 11 05:09:17
>>>>>>>>>>>>>> 2008
>>>>>>>>>>>>>> @@ -32,6 +32,15 @@
>>>>>>>>>>>>>>   };
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> +namespace X86 {
>>>>>>>>>>>>>> +  /// SubregIndex - The index of various sized  
>>>>>>>>>>>>>> subregister classes.
>>>>>>>>>>>>>> Note that
>>>>>>>>>>>>>> +  /// these indices must be kept in sync with the  
>>>>>>>>>>>>>> class indices in
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> +  /// X86RegisterInfo.td file.
>>>>>>>>>>>>>> +  enum SubregIndex {
>>>>>>>>>>>>>> +    SUBREG_8BIT = 1, SUBREG_16BIT = 2, SUBREG_32BIT = 3
>>>>>>>>>>>>>> +  };
>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> /// DWARFFlavour - Flavour of dwarf regnumbers
>>>>>>>>>>>>>> ///
>>>>>>>>>>>>>> namespace DWARFFlavour {
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Modified: llvm/trunk/lib/Target/X86/X86RegisterInfo.td
>>>>>>>>>>>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/ 
>>>>>>>>>>>>>> lib/Target/X86/X86RegisterInfo.td? 
>>>>>>>>>>>>>> rev=48223&r1=48222&r2=48223&view=diff
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> =
>>>>>>>>>>>>>> ========================================================= 
>>>>>>>>>>>>>> =============
>>>>>>>>>>>>>> --- llvm/trunk/lib/Target/X86/X86RegisterInfo.td  
>>>>>>>>>>>>>> (original)
>>>>>>>>>>>>>> +++ llvm/trunk/lib/Target/X86/X86RegisterInfo.td Tue  
>>>>>>>>>>>>>> Mar 11 05:09:17
>>>>>>>>>>>>>> 2008
>>>>>>>>>>>>>> @@ -176,6 +176,10 @@
>>>>>>>>>>>>>> // sub registers for each register.
>>>>>>>>>>>>>> //
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> +def x86_subreg_8bit    : PatLeaf<(i32 1)>;
>>>>>>>>>>>>>> +def x86_subreg_16bit   : PatLeaf<(i32 2)>;
>>>>>>>>>>>>>> +def x86_subreg_32bit   : PatLeaf<(i32 3)>;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> def : SubRegSet<1, [AX, CX, DX, BX, SP,  BP,  SI,  DI,
>>>>>>>>>>>>>>                     R8W, R9W, R10W, R11W, R12W, R13W,  
>>>>>>>>>>>>>> R14W, R15W],
>>>>>>>>>>>>>>                    [AL, CL, DL, BL, SPL, BPL, SIL, DIL,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>> llvm-commits mailing list
>>>>>>>>>>>>>> llvm-commits at cs.uiuc.edu
>>>>>>>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>>>>>>>>>
>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>> llvm-commits mailing list
>>>>>>>>>>>>> llvm-commits at cs.uiuc.edu
>>>>>>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Christopher Lamb
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> llvm-commits mailing list
>>>>>>>>>>>> llvm-commits at cs.uiuc.edu
>>>>>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> llvm-commits mailing list
>>>>>>>>>>> llvm-commits at cs.uiuc.edu
>>>>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Christopher Lamb
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> llvm-commits mailing list
>>>>>>>>>> llvm-commits at cs.uiuc.edu
>>>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> llvm-commits mailing list
>>>>>>>>> llvm-commits at cs.uiuc.edu
>>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> llvm-commits mailing list
>>>>>>>> llvm-commits at cs.uiuc.edu
>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> llvm-commits mailing list
>>>>>>> llvm-commits at cs.uiuc.edu
>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>>
>>>>>> --
>>>>>> Christopher Lamb
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Christopher Lamb
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> llvm-commits mailing list
>>>>> llvm-commits at cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>
>>
>> --
>> Christopher Lamb
>>
>>
>>
>

--
Christopher Lamb

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20080314/42c038e1/attachment.html>