[PATCH][AVX512] Generate masking instruction variants with tablegen

Adam Nemet anemet at apple.com
Thu Aug 7 11:04:26 PDT 2014


Hi Robert,

On Aug 7, 2014, at 4:04 AM, Robert Khasanov <rob.khasanov at gmail.com> wrote:

> Hi Adam, 
> 
> 
> I have some notes to your change.
> 
> * Some instructions have masking versions, but don't have zero-masking (e.g. stores and other write memory instructions). I think there are two different ways to take into account: 
>    1) make separate multiclasses for (r +rk + rkz) and (r+rk); 
>    2) Propose new feature to TableGen to create conditional def/defm in multiclasses. This feature will be suitable also for embedded broadcasts: only instructions w/ 32 and 64 bit element size have embedded broadcast feature. However, I don't know how it can look in syntax, I have just an idea and I'm not sure the community will accept this.

I have been successful using multiclass inheritance for this.  Take a look at avx512_perm_table_3src for example which derives from avx512_perm_3src only to add the intrinsics (the idea being that for ).  Sounds like the same approach would work for the cases you mention.

> * There are some instructions that have not lowering patterns from LLVM IR instructions. I suggest to use !if(x,[(set RC:$dst, RHS)],[]) where x is some condition or bit that is passed from multiclass arg. 

Agreed.

> You can fix last item in this patch or later in separate patch. 

I will do that when adding an instance of such instruction.

> LGTM

Thanks for the review!  Committed as r215125.

Adam

> 
> 
> 
> 2014-08-07 9:28 GMT+04:00 Adam Nemet <anemet at apple.com>:
> Hi,
> 
> After adding the masking variants to several instructions, I have decided to
> experiment with generating these from the non-masking/unconditional
> variant. This will hopefully reduce the amount repetition that we currently
> have in order to define an instruction with all its variants (for a reg/mem
> instruction this would be 6 instruction defs and 2 Pat<> for the intrinsic).
> 
> The patch is the first cut that is currently only applied to valignd/q to keep
> the patch small.
> 
> A few notes on the approach:
> 
>   * In order to stitch together the dag for both the conditional and the
>   unconditional patterns I pass the RHS of the set rather than the full
>   pattern (set dest, RHS).
>   * Rather than subclassing each instruction base class (e.g. AVX512AIi8),
>   with a masking variant which wouldn't scale, I derived the masking
>   instructions from a new base class AVX512 (this is just I<> with
>   Requires<HasAVX512>).  The instructions derive from this now, plus a new set
>   of classes that add the format bits and everything else that instruction
>   base class provided (i.e. AVX512AIi8 vs. AVX512AIi8Base).
> 
> I hope we can go incrementally from here.  I expect that:
> 
>   * We will need different variants of the masking class.  One example is
>   instructions requiring three vector sources.  In this case we tie one of the
>   source operands to dest rather than a new implicit source operand ($src0)
>   * Add the zero-masking variant
>   * Add more AVX512*Base classes as new uses are added
> 
> I've looked at X86.td.expanded before and after to make sure that nothing got
> lost for valignd/q.
> 
> Please let me know if it looks good.
> 
> Adam
> 
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140807/839920b1/attachment.html>


More information about the llvm-commits mailing list