[PATCH][X86] AVX512: Add writemask variants for vperm*2*

Adam Nemet anemet at apple.com
Wed Jul 2 14:38:47 PDT 2014


On Jul 2, 2014, at 2:00 AM, Demikhovsky, Elena <elena.demikhovsky at intel.com> wrote:

> Hi Adam,
> 
> Could you, please, add one more test for < 8 x i64 > type for VPERM2Q?
> 
> I just remember that (v8i64 immAllZerosV) did not work and I used 
> (bc_v8i64 (v16i32 immAllZerosV)). 

Hi Elena,

Sure, but my worry was that it wouldn’t match for v16i32 since there is no convert in that case.  That node is simply (v16i32 immAllZerosV).

However, as it turns out, the ISel generator is intelligent enough to remove the bitconvert in this case.

For v8i64, it is matched with BITCAST:

/*127982*/            OPC_RecordChild2, // #3 = $src3
/*127983*/            OPC_MoveParent,
/*127984*/            OPC_MoveChild, 2,
/*127986*/            OPC_CheckOpcode, TARGET_VAL(ISD::BITCAST),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/*127989*/            OPC_MoveChild, 0,
/*127991*/            OPC_CheckOpcode, TARGET_VAL(ISD::BUILD_VECTOR),
/*127994*/            OPC_CheckPredicate, 34, // Predicate_immAllZerosV
/*127996*/            OPC_CheckType, MVT::v16i32,
/*127998*/            OPC_MoveParent,
/*127999*/            OPC_MoveParent,
/*128000*/            OPC_CheckType, MVT::v8i64,
/*128002*/            OPC_CheckPatternPredicate, 4, // (Subtarget->hasAVX512())
/*128004*/            OPC_MorphNodeTo, TARGET_VAL(X86::VPERMT2Qrrkz), 0,

Whereas for v16i32, it is matched without the BITCAST:

/*124212*/            OPC_RecordChild2, // #3 = $src3
/*124213*/            OPC_MoveParent,
/*124214*/            OPC_MoveChild, 2,
/*124216*/            OPC_CheckOpcode, TARGET_VAL(ISD::BUILD_VECTOR),
/*124219*/            OPC_CheckPredicate, 34, // Predicate_immAllZerosV
/*124221*/            OPC_MoveParent,
/*124222*/            OPC_CheckType, MVT::v16i32,
/*124224*/            OPC_CheckPatternPredicate, 4, // (Subtarget->hasAVX512())
/*124226*/            OPC_MorphNodeTo, TARGET_VAL(X86::VPERMT2Drrkz), 0,

Very cool.  So the “canonical” way to match this is (OpVT (bitconvert (v16i32 immAllZerosV))).
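
To make the difference between the two tables concrete, here is a rough toy model in Python.  It is not LLVM code: Node, is_all_zeros_vector and matches_zero_operand are hypothetical names made up for this sketch, and the real matcher is the generated table shown above.  It only assumes what the two excerpts show, namely that a BITCAST check wraps the v16i32 zero vector when OpVT differs from v16i32, and that no BITCAST check is emitted when OpVT is v16i32 itself.

```python
from dataclasses import dataclass
from typing import Optional

# Type of the all-zeros vector in the pattern (taken from the tables above).
ZERO_VT = "v16i32"


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a DAG node; not an LLVM class."""
    opcode: str
    vt: str
    operand: Optional["Node"] = None
    all_zeros: bool = False


def is_all_zeros_vector(node: Node) -> bool:
    # Mirrors CheckOpcode BUILD_VECTOR + Predicate_immAllZerosV + CheckType v16i32.
    return node.opcode == "BUILD_VECTOR" and node.all_zeros and node.vt == ZERO_VT


def matches_zero_operand(node: Node, op_vt: str) -> bool:
    """Toy version of matching (OpVT (bitconvert (v16i32 immAllZerosV)))."""
    if op_vt == ZERO_VT:
        # Same-type bitconvert is a no-op, so no BITCAST is expected here.
        return is_all_zeros_vector(node)
    # Otherwise the zero vector must sit underneath a BITCAST to OpVT.
    return (
        node.opcode == "BITCAST"
        and node.vt == op_vt
        and node.operand is not None
        and is_all_zeros_vector(node.operand)
    )


zeros = Node("BUILD_VECTOR", ZERO_VT, all_zeros=True)
zeros_as_v8i64 = Node("BITCAST", "v8i64", operand=zeros)

print(matches_zero_operand(zeros_as_v8i64, "v8i64"))  # True
print(matches_zero_operand(zeros, "v16i32"))          # True
print(matches_zero_operand(zeros, "v8i64"))           # False
```

The last call shows the case I was originally worried about in reverse: a bare v16i32 zero vector does not satisfy the v8i64 form, which is why the pattern has to be written with the bitconvert and left to the generator to simplify.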

So I removed the FIXME and beefed up the testing.  Committed as r212221, r212222 and r212223.

Thanks for the review!

Adam


> 
> All other things are OK.
> 
> -  Elena
> 
> 
> -----Original Message-----
> From: Adam Nemet [mailto:anemet at apple.com] 
> Sent: Wednesday, July 02, 2014 10:35
> To: Demikhovsky, Elena
> Cc: llvm-commits
> Subject: [PATCH][X86] AVX512: Add writemask variants for vperm*2*
> 
> Hi Elena,
> 
> This enables writemasks with vperm*2* in asm, codegen and the intrinsics.  There are more comments in the patch files.  Please let me know if it looks good.
> 
> Thanks,
> Adam
> 




