[llvm-commits] Optimization for SIGN_EXTEND on AVX and AVX2 - please review
Demikhovsky, Elena
elena.demikhovsky at intel.com
Tue Jan 15 05:09:14 PST 2013
> X86 turns it into v8i8 -> v8i16 ->v8i32 ->v8i64
DAGCombiner takes the last v8i32 ->v8i64, combines with v8i16 ->v8i32. Than it sees a new node again. Takes v8i16 -> v8i64 and combines with v8i8 -> v8i16.
And I have an endless loop that splits and joins SEXT nodes.
I looked for a "delimiter" node that can be inserted between nodes will disappear at the end, but I did not find such.
- Elena
-----Original Message-----
From: Duncan Sands [mailto:duncan.sands at gmail.com] On Behalf Of Duncan Sands
Sent: Tuesday, January 15, 2013 14:49
To: Demikhovsky, Elena
Cc: llvm-commits at cs.uiuc.edu
Subject: Re: [llvm-commits] Optimization for SIGN_EXTEND on AVX and AVX2 - please review
Hi Elena,
On 15/01/13 13:42, Demikhovsky, Elena wrote:
>> why don't you have the X86 target turn this immediately into a VPMOVSXBW, VPMOVSXWD, VPMOVSXDQ combination, rather than going through all of these intermediate steps?
>
> I already have the code that does the work in LowerSIGN_EXTEND(). It is a little bit more than just VPMOVSX.
> Combining load with "sext v8i8 -> v8i16" happens in the common DAGCombine. I don't want to duplicate it too.
OK, I misunderstood the role of the load at first.
>
>> I don't like turning off generally useful transforms like sext(sext)->sext.
> I don't turn it off, just do it after type legalizer.
I still didn't understand why this is needed. I'm imagining this: the DAG combiner runs, and calls the X86 combiner on v8i8 - > v8i64. X86 turns it into v8i8 -> v8i16 ->v8i32 ->v8i64. As the DAG/X86 combiner runs until there is no more work to do, the DAG combiner should then combine the load with the v8i8
-> v8i16 and so on until you only have the target instructions you
-> described
(VPMOVSXBW etc). So no-one outside of the DAG combiner, in particular the type legalizer, should ever see the intermediate v8i8 -> v8i16 ->v8i32 ->v8i64 form.
Is this not how it happens?
Ciao, Duncan.
>
> - Elena
>
> -----Original Message-----
> From: Duncan Sands [mailto:duncan.sands at gmail.com] On Behalf Of Duncan
> Sands
> Sent: Tuesday, January 15, 2013 14:26
> To: Demikhovsky, Elena
> Cc: llvm-commits at cs.uiuc.edu
> Subject: Re: [llvm-commits] Optimization for SIGN_EXTEND on AVX and
> AVX2 - please review
>
> Hi Elena,
>
> On 15/01/13 13:21, Demikhovsky, Elena wrote:
>> I put 2 or 3 sequential sign-extend operations.
>>
>> original: v8i8 - > v8i64
>
> why don't you have the X86 target turn this immediately into a VPMOVSXBW, VPMOVSXWD, VPMOVSXDQ combination, rather than going through all of these intermediate steps?
>
> I don't like turning off generally useful transforms like sext(sext)->sext.
>
> Ciao, Duncan.
>
>> new: v8i8 -> v8i16 ->v8i32 ->v8i64
>>
>> I combine load with v8i8 -> v8i16 and use VPMOVSXBW instruction.
>> I translate v8i16 ->v8i32 to 2 VPMOVSXWD instructions.
>> And then I have 4 VPMOVSXDQ for v8i32 ->v8i64.
>>
>> if I work with one SIGN_EXTEND operation with illegal types, I receive a scalar code after type legalizer.
>>
>> - Elena
>>
>> -----Original Message-----
>> From: llvm-commits-bounces at cs.uiuc.edu
>> [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Duncan Sands
>> Sent: Tuesday, January 15, 2013 13:58
>> To: llvm-commits at cs.uiuc.edu
>> Subject: Re: [llvm-commits] Optimization for SIGN_EXTEND on AVX and
>> AVX2 - please review
>>
>> Hi Elena,
>>
>> On 15/01/13 12:47, Demikhovsky, Elena wrote:
>>> I optimized the following SEXT pairs:
>>> v8i8 -> v8i64
>>> v8i8->v8i32
>>> v4i8 ->v4i64
>>> v4i16->v4i64
>>
>>> Index: lib/CodeGen/SelectionDAG/DAGCombiner.cpp
>>> ===================================================================
>>> --- lib/CodeGen/SelectionDAG/DAGCombiner.cpp (revision 172422)
>>> +++ lib/CodeGen/SelectionDAG/DAGCombiner.cpp (working copy)
>>> @@ -4298,12 +4298,13 @@
>>> if (isa<ConstantSDNode>(N0))
>>> return DAG.getNode(ISD::SIGN_EXTEND, N->getDebugLoc(), VT,
>>> N0);
>>>
>>> - // fold (sext (sext x)) -> (sext x)
>>> - // fold (sext (aext x)) -> (sext x)
>>> - if (N0.getOpcode() == ISD::SIGN_EXTEND || N0.getOpcode() == ISD::ANY_EXTEND)
>>> - return DAG.getNode(ISD::SIGN_EXTEND, N->getDebugLoc(), VT,
>>> - N0.getOperand(0));
>>> -
>>> + if (Level >= AfterLegalizeTypes) {
>>> + // fold (sext (sext x)) -> (sext x)
>>> + // fold (sext (aext x)) -> (sext x)
>>> + if (N0.getOpcode() == ISD::SIGN_EXTEND || N0.getOpcode() == ISD::ANY_EXTEND)
>>> + return DAG.getNode(ISD::SIGN_EXTEND, N->getDebugLoc(), VT,
>>> + N0.getOperand(0)); }
>>
>> why is this needed? As the transform doesn't introduce any illegal types that weren't there already it should be safe to perform at any level.
>>
>> If you are worried that it might create an illegal operation, then that is what you should check for, not the Level.
>>
>>> if (N0.getOpcode() == ISD::TRUNCATE) {
>>> // fold (sext (truncate (load x))) -> (sext (smaller load x))
>>> // fold (sext (truncate (srl (load x), c))) -> (sext (smaller
>>> load (x+c/n)))
>>
>> ...
>>
>>> --- lib/CodeGen/SelectionDAG/SelectionDAG.cpp (revision 172422)
>>> +++ lib/CodeGen/SelectionDAG/SelectionDAG.cpp (working copy)
>>> @@ -2554,9 +2554,7 @@
>>> VT.getVectorNumElements() ==
>>> Operand.getValueType().getVectorNumElements()) &&
>>> "Vector element count mismatch!");
>>> - if (OpOpcode == ISD::SIGN_EXTEND || OpOpcode == ISD::ZERO_EXTEND)
>>> - return getNode(OpOpcode, DL, VT, Operand.getNode()->getOperand(0));
>>> - else if (OpOpcode == ISD::UNDEF)
>>> + if (OpOpcode == ISD::UNDEF)
>>> // sext(undef) = 0, because the top bits will all be the same.
>>> return getConstant(0, VT);
>>> break;
>>
>> Likewise, why are you getting rid of this folding?
>>
>> Ciao, Duncan.
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>> ---------------------------------------------------------------------
>> Intel Israel (74) Limited
>>
>> This e-mail and any attachments may contain confidential material for
>> the sole use of the intended recipient(s). Any review or distribution
>> by others is strictly prohibited. If you are not the intended
>> recipient, please contact the sender and delete all copies.
>>
>
> ---------------------------------------------------------------------
> Intel Israel (74) Limited
>
> This e-mail and any attachments may contain confidential material for
> the sole use of the intended recipient(s). Any review or distribution
> by others is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies.
>
---------------------------------------------------------------------
Intel Israel (74) Limited
This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
More information about the llvm-commits
mailing list