[llvm-commits] [PATCH] X86: Update GATHER instructions to support 2 read-write registers
Manman Ren
mren at apple.com
Thu Jul 12 09:58:35 PDT 2012
I went through the td files, and only saw pseudo instructions "ATOMxx" use constraints differently.
All other instructions except GATHER have operand at index 1 tied to operand at index 0.
I think pseudo instructions do not go through the section of code changed in the proposed patches.
Correct me if I am wrong.
Thanks,
Manman
On Jul 11, 2012, at 10:22 PM, Craig Topper wrote:
>
>
> On Wed, Jul 11, 2012 at 10:08 AM, Manman Ren <mren at apple.com> wrote:
>
> Your fix in X86RecognizableInstr.cpp is cleaner and it effectively is reversing the direction of the TIED_TO.
>
> About the following change:
> - if (NumOps > 1 && Desc.getOperandConstraint(1, MCOI::TIED_TO) != -1)
> + if (NumOps > 1 && Desc.getOperandConstraint(1, MCOI::TIED_TO) == 0)
> ++CurOp;
> - else if (NumOps > 2 && Desc.getOperandConstraint(NumOps-1, MCOI::TIED_TO)== 0)
> - // Skip the last source operand that is tied_to the dest reg. e.g. LXADD32
> - --NumOps;
> + else if (NumOps > 3 && Desc.getOperandConstraint(2, MCOI::TIED_TO) == 0 &&
> + Desc.getOperandConstraint(NumOps - 1, MCOI::TIED_TO) == 1)
> + // Special case for GATHER with 2 TIED_TO operands
> + // Skip the first 2 operands: dst, mask_wb
> + CurOp += 2;
>
> Is there a case where we need to skip the last source operand that is tied_to the dest reg?
> Do we only need to handle the case where operand at index 1 is tied to operand at index 0?
> I checked the td file for LXADD32, it seems that the first source operand is tied to the dest reg.
>
> I ran "make check" with an assert(0) in the case that was supposed to handle LXADD32. Looks like the operands for LXADD32 were reversed after the code was added to CodeEmitter.cpp so its effectively dead code now.
>
>
> If there is a case where we need to skip the last source operand, will reversing the direction of the TIED_TO in operandMapping causing a problem?
> Assume we have an instruction with 3 operands, and the last operand is tied to the first operand,
> before the change, we will have physical operands at index 0, 1 and after the change, we will physical operands at index 1 and 2.
> Will that be a problem?
>
> All the tests pass. I'll try to poke around and see if I can find any cases that aren't handled correctly.
>
>
> Thanks a lot,
> Manman
>
> On Jul 11, 2012, at 12:01 AM, Craig Topper wrote:
>
>> I was playing around with this myself this evening. Here's my attempt at a patch. I've merged in some stuff from your patch as well. I've attempted to fix X86RecognizableInstr.cpp in a different way.
>>
>> On Tue, Jul 10, 2012 at 11:10 PM, Manman Ren <mren at apple.com> wrote:
>>
>> GATHER instructions VGATHERxx take 3 operands: op1 is a register operand, op2 is a memory operand, op3 is a register operand.
>> Both op1 and op3 are read-write, after execution of the instruction, both are updated.
>>
>> There are issues related to handling two TIED_TO constraints and how to use TIED_TO constraints to correctly encode/decode VGATHERxx.
>> The discussion at llvmdev is appended to the end of this message.
>>
>> Please review and provide feedback.
>>
>> Thanks,
>> Manman
>>
>>
>> Begin forwarded message:
>>
>>> From: Manman Ren <mren at apple.com>
>>> Subject: Re: [LLVMdev] question on table gen TIED_TO constraint
>>> Date: July 10, 2012 3:52:53 PM PDT
>>> To: Craig Topper <craig.topper at gmail.com>
>>> Cc: Evan Cheng <evan.cheng at apple.com>, "llvmdev at cs.uiuc.edu" <llvmdev at cs.uiuc.edu>
>>>
>>>
>>> Using VEX_4V may not be the right fix since Intel guide says "Operand 3: VEX.vvvv"
>>>
>>> I was talking about this section of code:
>>> void RecognizableInstr::handleOperand(
>>> ...
>>> while (operandMapping[operandIndex] != operandIndex) {
>>> Spec->operands[operandIndex].encoding = ENCODING_DUP;
>>> Spec->operands[operandIndex].type =
>>> (OperandType)(TYPE_DUP0 + operandMapping[operandIndex]);
>>> ++operandIndex;
>>> }
>>> …
>>> }
>>>
>>> void RecognizableInstr::emitInstructionSpecifier(DisassemblerTables &tables) {
>>> ...
>>> case X86Local::MRMSrcMem:
>>> // Operand 1 is a register operand in the Reg/Opcode field.
>>> // Operand 2 is a memory operand (possibly SIB-extended)
>>> // - In AVX, there is a register operand in the VEX.vvvv field here -
>>> // Operand 3 (optional) is an immediate.
>>> ...
>>> HANDLE_OPERAND(roRegister)
>>>
>>> if (HasVEX_4VPrefix)
>>> // FIXME: In AVX, the register below becomes the one encoded
>>> // in ModRMVEX and the one above the one in the VEX.VVVV field
>>> HANDLE_OPERAND(vvvvRegister)
>>>
>>> if (HasMemOp4Prefix)
>>> HANDLE_OPERAND(immediate)
>>>
>>> HANDLE_OPERAND(memory)
>>>
>>> if (HasVEX_4VOp3Prefix)
>>> HANDLE_OPERAND(vvvvRegister)
>>> …
>>> }
>>>
>>> For GATHER with 5 operands (dst, mask_wb, src1, mem, mask), the operandMapping is "0 to 0, 1 to 1, 2 to 0, 3 to 3, 4 to 1", operand 2 is tied to operand 0, operand 4 is tied to operand 1.
>>> So operand 2 and 4 (src1, mask) are treated as DUP, and the physical operands are 0,1,3(dst, mask_wb, mem), while MRSrcMem assumes Reg, Mem, Reg.vvvv if HasVEX_4VOp3Prefix is true.
>>> Same situation happens in X86MCCodeEmitter.cpp, where TIED_TO for operand 2 is operand 0. We can only increment CurOp once for operand 2, since TIED_TO of operand 1 is -1.
>>> What we really want is to reverse the direction of TIED_TO for mask and mask_wb.
>>>
>>> We can probably hack this to handle the special case of two tied-to operands, if we have 2 tied-to operands, handle operand vvvvRegister first, then handle memory operand in MRMSrcMem.
>>> In X86MCCodeEmitter.cpp, we increase CurOp twice if we have 2 tied-to operands. But it is kind of ugly.
>>>
>>> + // FIXME: Handle the special case for GATHER:
>>> + // For GATHER with 5 operands (dst, mask_wb, src1, mem, mask), src1 is tied
>>> + // to dst and mask is tied to mask_wb. The operandMapping is "0 to 0,
>>> + // 1 to 1, 2 to 0, 3 to 3, 4 to 1". The 2nd physical operand is mask_wb, and
>>> + // it is before mem, so we need to explicitly handle vvvvRegister first.
>>> + if (HasVEX_4VOp3Prefix && tiedOperandsCount >= 2)
>>> + HANDLE_OPERAND(vvvvRegister)
>>> +
>>>
>>> Thanks,
>>> Manman
>>>
>>> On Jul 9, 2012, at 11:47 PM, Craig Topper wrote:
>>>
>>>> I don't think changing to VEX_4VOp3 to VEX_4V is the right fix. I think the fix is to increment CurOp twice at the start for these instructions so that only the input operands are used for encoding.
>>>>
>>>> Also, I just submitted a patch to revert the operand order for these instructions in the assembler/disassembler. Destination register should appear on the right and the mask should appear on the left as we use AT&T syntax by default. It will probably conflict with your updates here.
>>>>
>>>> On Mon, Jul 9, 2012 at 11:24 PM, Manman Ren <mren at apple.com> wrote:
>>>>
>>>> Yes, there is an easy way to fix this.
>>>> MRMSrcMem assumes register, memory, vvvv register if VEX_4VOp3 is true and assumes register, vvvv register, memory if VEX_4V is true.
>>>>
>>>> I just need to change the flag from VEX_4VOp3 to VEX_4V. There are a few places where we assume only the 2nd operand can be tied-to:
>>>> Desc->getOperandConstraint(1, MCOI::TIED_TO) != -1 (hard-coded index 1)
>>>> I will fix those to handle this instruction.
>>>>
>>>> Thanks,
>>>> Manman
>>>>
>>>> On Jul 9, 2012, at 10:07 PM, Evan Cheng wrote:
>>>>
>>>> >
>>>> >
>>>> > On Jul 9, 2012, at 4:15 PM, Manman Ren <mren at apple.com> wrote:
>>>> >
>>>> >>
>>>> >> I need to implement an instruction which has 2 read-write registers, so I added
>>>> >> let Constraints = "$src1 = $dst, $mask = $mask_wb" in {
>>>> >> ...
>>>> >> def rm : AVX28I<opc, MRMSrcMem, (outs VR128:$dst, VR128:$mask_wb),
>>>> >> (ins VR128:$src1, v128mem:$src2, VR128:$mask),
>>>> >> ...
>>>> >> }
>>>> >> There is a problem since MRMSrcMem assumes the 2nd physical operand is a memory operand.
>>>> >> See the section about MRMSrcMem in RecognizableInstr::emitInstructionSpecifier.
>>>> >
>>>> > Can this be fixed?
>>>> >
>>>> > Evan
>>>> >
>>>> >> And the above gives us $dst, $mask_wb, $src1, $mem, $mask, and $mask_wb is the second physical operand.
>>>> >>
>>>> >> I thought about using "$mask_wb = $mask", but it breaks the assumption of TIED_TO LhsIdx > RhsIdx.
>>>> >> Is adding another addressing mode a good idea?
>>>> >>
>>>> >> Any pointer is appreciated.
>>>> >> Thanks,
>>>> >> Manman
>>>> >> _______________________________________________
>>>> >> LLVM Developers mailing list
>>>> >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>>>> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>
>>>>
>>>>
>>>> --
>>>> ~Craig
>>>
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>>
>>
>>
>> --
>> ~Craig
>> <gather.patch>
>
>
>
>
> --
> ~Craig
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120712/b5329753/attachment.html>
More information about the llvm-commits
mailing list