[PATCH] D113214: [IR][ShuffleVector] Introduce `isReplicationMask()` matcher

Roman Lebedev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 4 14:21:33 PDT 2021


lebedev.ri created this revision.
lebedev.ri added reviewers: spatel, RKSimon.
lebedev.ri added a project: LLVM.
Herald added subscribers: dexonsmith, pengfei, hiraditya.
lebedev.ri requested review of this revision.

Avid readers of this saga may recall from previous installments
that a replication mask replicates (lol) each of the `VF` elements
in a vector `ReplicationFactor` times. For example, the mask for
`ReplicationFactor=3` and `VF=4` is: `<0,0,0,1,1,1,2,2,2,3,3,3>`.
More importantly, the replication mask is used by LoopVectorizer
when using masked interleaved memory operations.
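
As a minimal sketch of the definition above (not code from this patch;
`makeReplicationMask` is a hypothetical helper name, and a plain
`std::vector<int>` stands in for LLVM's mask type), the mask can be
generated like this:

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not part of the LLVM API: builds the replication
// mask that repeats each of the VF source elements ReplicationFactor times.
std::vector<int> makeReplicationMask(int ReplicationFactor, int VF) {
  std::vector<int> Mask;
  for (int Elt = 0; Elt < VF; ++Elt)
    for (int Copy = 0; Copy < ReplicationFactor; ++Copy)
      Mask.push_back(Elt); // Every copy refers to the same source element.
  return Mask;
}
```

Calling `makeReplicationMask(3, 4)` yields the twelve-element mask from
the example above.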

As discussed in previous installments, while it is used by LV,
and we **seem** to support masked interleaved memory operations on X86,
its support in the cost model leaves a lot to be desired:
until basically yesterday we had no cost model for it, even for AVX512.

As witnessed in the recent AVX2 `X86TTIImpl::getInterleavedMemoryOpCost()`
costmodel patches, it is hard enough to query the cost
of a particular assembly sequence [from llvm-mca],
and afterwards the check lines of the LV costmodel tests must still be updated manually.
This is, at the very least, boring.

Okay, now we have decent costmodel coverage for interleaving shuffles,
but basically the same mind-killing sequence has to be performed
for replication masks. I think we can improve at least the second half
of the problem by teaching `TargetTransformInfoImplCRTPBase::getUserCost()`
to recognize `Instruction::ShuffleVector`s that are replication masks,
and adding exhaustive test coverage using `-cost-model -analyze` + `utils/update_analyze_test_checks.py`.

This way we can have good exhaustive coverage for cost model,
and only basic coverage for the LV costmodel.

This patch adds precise undef-aware `isReplicationMask()`, with exhaustive test coverage.
`InstructionsTest.ShuffleMaskIsReplicationMask` shows that it correctly detects all the known masks.
`InstructionsTest.ShuffleMaskIsReplicationMask_Exhaustive_Correctness` shows that if
we detect a replication mask with given params, and then actually generate
a true replication mask with said params, the two match element-wise, ignoring undef mask elements.
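
For illustration only, here is a sketch of what an undef-aware check
could look like. It is not the implementation from this patch: the
helper names are hypothetical, a `std::vector<int>` stands in for the
mask type, `-1` is assumed to denote an undef element, and the search
order (smallest factor first) is simply a choice of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption of this sketch: -1 marks an undef mask element.
constexpr int UndefElt = -1;

// Hypothetical helper: does Mask agree, ignoring undef elements, with the
// replication mask for exactly these parameters?
bool matchesReplicationMask(const std::vector<int> &Mask,
                            int ReplicationFactor, int VF) {
  if (ReplicationFactor <= 0 || VF <= 0)
    return false;
  if (Mask.size() != static_cast<std::size_t>(ReplicationFactor) * VF)
    return false;
  for (std::size_t I = 0; I < Mask.size(); ++I) {
    // Position I of a true replication mask holds I / ReplicationFactor.
    int Expected = static_cast<int>(I / ReplicationFactor);
    if (Mask[I] != UndefElt && Mask[I] != Expected)
      return false;
  }
  return true;
}

// Hypothetical helper: tries every factor that divides the mask size and
// reports the first parameters that match (smallest factor first).
bool findReplicationParams(const std::vector<int> &Mask,
                           int &ReplicationFactor, int &VF) {
  int Size = static_cast<int>(Mask.size());
  for (int RF = 1; RF <= Size; ++RF) {
    if (Size % RF != 0)
      continue;
    if (matchesReplicationMask(Mask, RF, Size / RF)) {
      ReplicationFactor = RF;
      VF = Size / RF;
      return true;
    }
  }
  return false;
}
```

With undef elements several parameter pairs can match the same mask
(an all-undef mask matches every factorization of its size), so any
real matcher has to pick a policy for which pair to report. The
correctness property described above is then: whatever pair is
reported, the true replication mask for that pair must agree with the
input on every non-undef element.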


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D113214

Files:
  llvm/include/llvm/IR/Instructions.h
  llvm/lib/IR/Instructions.cpp
  llvm/unittests/IR/InstructionsTest.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D113214.384864.patch
Type: text/x-patch
Size: 7214 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20211104/06de151d/attachment.bin>

