[PATCH][AVX512] Generate masking instruction variants with tablegen
Adam Nemet
anemet at apple.com
Thu Aug 7 11:04:26 PDT 2014
Hi Robert,
On Aug 7, 2014, at 4:04 AM, Robert Khasanov <rob.khasanov at gmail.com> wrote:
> Hi Adam,
>
>
> I have some notes to your change.
>
> * Some instructions have masking versions, but don't have zero-masking (e.g. stores and other write memory instructions). I think there are two different ways to take into account:
> 1) make separate multiclasses for (r +rk + rkz) and (r+rk);
> 2) Propose new feature to TableGen to create conditional def/defm in multiclasses. This feature will be suitable also for embedded broadcasts: only instructions w/ 32 and 64 bit element size have embedded broadcast feature. However, I don't know how it can look in syntax, I have just an idea and I'm not sure the community will accept this.
I have been successful using multiclass inheritance for this. Take a look at avx512_perm_table_3src for example which derives from avx512_perm_3src only to add the intrinsics (the idea being that for ). Sounds like the same approach would work for the cases you mention.
> * There are some instructions that have not lowering patterns from LLVM IR instructions. I suggest to use !if(x,[(set RC:$dst, RHS)],[]) where x is some condition or bit that is passed from multiclass arg.
Agreed.
> You can fix last item in this patch or later in separate patch.
I will do that when adding an instance of such instruction.
> LGTM
Thanks for the review! Committed as r215125.
Adam
>
>
>
> 2014-08-07 9:28 GMT+04:00 Adam Nemet <anemet at apple.com>:
> Hi,
>
> After adding the masking variants to several instructions, I have decided to
> experiment with generating these from the non-masking/unconditional
> variant. This will hopefully reduce the amount repetition that we currently
> have in order to define an instruction with all its variants (for a reg/mem
> instruction this would be 6 instruction defs and 2 Pat<> for the intrinsic).
>
> The patch is the first cut that is currently only applied to valignd/q to keep
> the patch small.
>
> A few notes on the approach:
>
> * In order to stitch together the dag for both the conditional and the
> unconditional patterns I pass the RHS of the set rather than the full
> pattern (set dest, RHS).
> * Rather than subclassing each instruction base class (e.g. AVX512AIi8),
> with a masking variant which wouldn't scale, I derived the masking
> instructions from a new base class AVX512 (this is just I<> with
> Requires<HasAVX512>). The instructions derive from this now, plus a new set
> of classes that add the format bits and everything else that instruction
> base class provided (i.e. AVX512AIi8 vs. AVX512AIi8Base).
>
> I hope we can go incrementally from here. I expect that:
>
> * We will need different variants of the masking class. One example is
> instructions requiring three vector sources. In this case we tie one of the
> source operands to dest rather than a new implicit source operand ($src0)
> * Add the zero-masking variant
> * Add more AVX512*Base classes as new uses are added
>
> I've looked at X86.td.expanded before and after to make sure that nothing got
> lost for valignd/q.
>
> Please let me know if it looks good.
>
> Adam
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140807/839920b1/attachment.html>
More information about the llvm-commits
mailing list