[llvm-commits] Optimization for SIGN_EXTEND on AVX and AVX2 - please review
Duncan Sands
baldrick at free.fr
Wed Jan 16 00:14:09 PST 2013
Hi Elena,
On 15/01/13 14:32, Demikhovsky, Elena wrote:
> v8i16 -VPMOVSXWD> v8i32 works only for AVX2. On AVX machine I have only v8i16 -VPMOVSXWD> v4i32.
>
> The logic of shuffles is implemented in LowerSIGN_EXTEND().
can't you call LowerSIGN_EXTEND directly then?
Ciao, Duncan.
>
> - Elena
>
> -----Original Message-----
> From: Duncan Sands [mailto:duncan.sands at gmail.com] On Behalf Of Duncan Sands
> Sent: Tuesday, January 15, 2013 15:24
> To: Demikhovsky, Elena
> Cc: llvm-commits at cs.uiuc.edu
> Subject: Re: [llvm-commits] Optimization for SIGN_EXTEND on AVX and AVX2 - please review
>
> Hi Elena,
>
> On 15/01/13 14:09, Demikhovsky, Elena wrote:
>>> X86 turns it into v8i8 -> v8i16 ->v8i32 ->v8i64
>>
>> DAGCombiner takes the last v8i32 ->v8i64, combines with v8i16 ->v8i32. Than it sees a new node again. Takes v8i16 -> v8i64 and combines with v8i8 -> v8i16.
>> And I have an endless loop that splits and joins SEXT nodes.
>> I looked for a "delimiter" node that can be inserted between nodes will disappear at the end, but I did not find such.
>
> thanks for explaining. Maybe you can generate
> v8i8 -> v8i16 -VPMOVSXWD> v8i32 ->v8i64 i.e. directly use a target instruction in the middle rather than an sext?
>
> Ciao, Duncan.
>
>>
>> - Elena
>>
>> -----Original Message-----
>> From: Duncan Sands [mailto:duncan.sands at gmail.com] On Behalf Of Duncan
>> Sands
>> Sent: Tuesday, January 15, 2013 14:49
>> To: Demikhovsky, Elena
>> Cc: llvm-commits at cs.uiuc.edu
>> Subject: Re: [llvm-commits] Optimization for SIGN_EXTEND on AVX and
>> AVX2 - please review
>>
>> Hi Elena,
>>
>> On 15/01/13 13:42, Demikhovsky, Elena wrote:
>>>> why don't you have the X86 target turn this immediately into a VPMOVSXBW, VPMOVSXWD, VPMOVSXDQ combination, rather than going through all of these intermediate steps?
>>>
>>> I already have the code that does the work in LowerSIGN_EXTEND(). It is a little bit more than just VPMOVSX.
>>> Combining load with "sext v8i8 -> v8i16" happens in the common DAGCombine. I don't want to duplicate it too.
>>
>> OK, I misunderstood the role of the load at first.
>>
>>>
>>>> I don't like turning off generally useful transforms like sext(sext)->sext.
>>> I don't turn it off, just do it after type legalizer.
>>
>> I still didn't understand why this is needed. I'm imagining this: the
>> DAG combiner runs, and calls the X86 combiner on v8i8 - > v8i64. X86
>> turns it into v8i8 -> v8i16 ->v8i32 ->v8i64. As the DAG/X86 combiner
>> runs until there is no more work to do, the DAG combiner should then
>> combine the load with the v8i8
>> -> v8i16 and so on until you only have the target instructions you
>> -> described
>> (VPMOVSXBW etc). So no-one outside of the DAG combiner, in particular the type legalizer, should ever see the intermediate v8i8 -> v8i16 ->v8i32 ->v8i64 form.
>> Is this not how it happens?
>>
>> Ciao, Duncan.
>>
>>>
>>> - Elena
>>>
>>> -----Original Message-----
>>> From: Duncan Sands [mailto:duncan.sands at gmail.com] On Behalf Of
>>> Duncan Sands
>>> Sent: Tuesday, January 15, 2013 14:26
>>> To: Demikhovsky, Elena
>>> Cc: llvm-commits at cs.uiuc.edu
>>> Subject: Re: [llvm-commits] Optimization for SIGN_EXTEND on AVX and
>>> AVX2 - please review
>>>
>>> Hi Elena,
>>>
>>> On 15/01/13 13:21, Demikhovsky, Elena wrote:
>>>> I put 2 or 3 sequential sign-extend operations.
>>>>
>>>> original: v8i8 - > v8i64
>>>
>>> why don't you have the X86 target turn this immediately into a VPMOVSXBW, VPMOVSXWD, VPMOVSXDQ combination, rather than going through all of these intermediate steps?
>>>
>>> I don't like turning off generally useful transforms like sext(sext)->sext.
>>>
>>> Ciao, Duncan.
>>>
>>>> new: v8i8 -> v8i16 ->v8i32 ->v8i64
>>>>
>>>> I combine load with v8i8 -> v8i16 and use VPMOVSXBW instruction.
>>>> I translate v8i16 ->v8i32 to 2 VPMOVSXWD instructions.
>>>> And then I have 4 VPMOVSXDQ for v8i32 ->v8i64.
>>>>
>>>> if I work with one SIGN_EXTEND operation with illegal types, I receive a scalar code after type legalizer.
>>>>
>>>> - Elena
>>>>
>>>> -----Original Message-----
>>>> From: llvm-commits-bounces at cs.uiuc.edu
>>>> [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Duncan Sands
>>>> Sent: Tuesday, January 15, 2013 13:58
>>>> To: llvm-commits at cs.uiuc.edu
>>>> Subject: Re: [llvm-commits] Optimization for SIGN_EXTEND on AVX and
>>>> AVX2 - please review
>>>>
>>>> Hi Elena,
>>>>
>>>> On 15/01/13 12:47, Demikhovsky, Elena wrote:
>>>>> I optimized the following SEXT pairs:
>>>>> v8i8 -> v8i64
>>>>> v8i8->v8i32
>>>>> v4i8 ->v4i64
>>>>> v4i16->v4i64
>>>>
>>>>> Index: lib/CodeGen/SelectionDAG/DAGCombiner.cpp
>>>>> ===================================================================
>>>>> --- lib/CodeGen/SelectionDAG/DAGCombiner.cpp (revision 172422)
>>>>> +++ lib/CodeGen/SelectionDAG/DAGCombiner.cpp (working copy)
>>>>> @@ -4298,12 +4298,13 @@
>>>>> if (isa<ConstantSDNode>(N0))
>>>>> return DAG.getNode(ISD::SIGN_EXTEND, N->getDebugLoc(), VT,
>>>>> N0);
>>>>>
>>>>> - // fold (sext (sext x)) -> (sext x)
>>>>> - // fold (sext (aext x)) -> (sext x)
>>>>> - if (N0.getOpcode() == ISD::SIGN_EXTEND || N0.getOpcode() == ISD::ANY_EXTEND)
>>>>> - return DAG.getNode(ISD::SIGN_EXTEND, N->getDebugLoc(), VT,
>>>>> - N0.getOperand(0));
>>>>> -
>>>>> + if (Level >= AfterLegalizeTypes) {
>>>>> + // fold (sext (sext x)) -> (sext x)
>>>>> + // fold (sext (aext x)) -> (sext x)
>>>>> + if (N0.getOpcode() == ISD::SIGN_EXTEND || N0.getOpcode() == ISD::ANY_EXTEND)
>>>>> + return DAG.getNode(ISD::SIGN_EXTEND, N->getDebugLoc(), VT,
>>>>> + N0.getOperand(0)); }
>>>>
>>>> why is this needed? As the transform doesn't introduce any illegal types that weren't there already it should be safe to perform at any level.
>>>>
>>>> If you are worried that it might create an illegal operation, then that is what you should check for, not the Level.
>>>>
>>>>> if (N0.getOpcode() == ISD::TRUNCATE) {
>>>>> // fold (sext (truncate (load x))) -> (sext (smaller load x))
>>>>> // fold (sext (truncate (srl (load x), c))) -> (sext
>>>>> (smaller load (x+c/n)))
>>>>
>>>> ...
>>>>
>>>>> --- lib/CodeGen/SelectionDAG/SelectionDAG.cpp (revision 172422)
>>>>> +++ lib/CodeGen/SelectionDAG/SelectionDAG.cpp (working copy)
>>>>> @@ -2554,9 +2554,7 @@
>>>>> VT.getVectorNumElements() ==
>>>>> Operand.getValueType().getVectorNumElements()) &&
>>>>> "Vector element count mismatch!");
>>>>> - if (OpOpcode == ISD::SIGN_EXTEND || OpOpcode == ISD::ZERO_EXTEND)
>>>>> - return getNode(OpOpcode, DL, VT, Operand.getNode()->getOperand(0));
>>>>> - else if (OpOpcode == ISD::UNDEF)
>>>>> + if (OpOpcode == ISD::UNDEF)
>>>>> // sext(undef) = 0, because the top bits will all be the same.
>>>>> return getConstant(0, VT);
>>>>> break;
>>>>
>>>> Likewise, why are you getting rid of this folding?
>>>>
>>>> Ciao, Duncan.
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>> --------------------------------------------------------------------
>>>> -
>>>> Intel Israel (74) Limited
>>>>
>>>> This e-mail and any attachments may contain confidential material
>>>> for the sole use of the intended recipient(s). Any review or
>>>> distribution by others is strictly prohibited. If you are not the
>>>> intended recipient, please contact the sender and delete all copies.
>>>>
>>>
>>> ---------------------------------------------------------------------
>>> Intel Israel (74) Limited
>>>
>>> This e-mail and any attachments may contain confidential material for
>>> the sole use of the intended recipient(s). Any review or distribution
>>> by others is strictly prohibited. If you are not the intended
>>> recipient, please contact the sender and delete all copies.
>>>
>>
>> ---------------------------------------------------------------------
>> Intel Israel (74) Limited
>>
>> This e-mail and any attachments may contain confidential material for
>> the sole use of the intended recipient(s). Any review or distribution
>> by others is strictly prohibited. If you are not the intended
>> recipient, please contact the sender and delete all copies.
>>
>
> ---------------------------------------------------------------------
> Intel Israel (74) Limited
>
> This e-mail and any attachments may contain confidential material for
> the sole use of the intended recipient(s). Any review or distribution
> by others is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies.
>
More information about the llvm-commits
mailing list