[llvm] r250027 - [x86] PR24562: fix incorrect folding of PSHUFB nodes with a mask where all indices have the most significant bit set.

Andrea Di Biagio via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 15 02:02:41 PDT 2015


Hi Bruno,

On Thu, Oct 15, 2015 at 12:43 AM, Bruno Cardoso Lopes via llvm-commits <
llvm-commits at lists.llvm.org> wrote:

> Hi Andrea,
>
> Nice catch!
>
> On Mon, Oct 12, 2015 at 4:25 AM, Andrea Di Biagio via llvm-commits
> <llvm-commits at lists.llvm.org> wrote:
> > Author: adibiagio
> > Date: Mon Oct 12 06:25:41 2015
> > New Revision: 250027
> >
> > URL: http://llvm.org/viewvc/llvm-project?rev=250027&view=rev
> > Log:
> > [x86] PR24562: fix incorrect folding of PSHUFB nodes with a mask where
> all indices have the most significant bit set.
> >
> > This patch fixes a problem in function 'combineX86ShuffleChain' that
> causes a
> > chain of shuffles to be wrongly folded away when the combined shuffle
> mask has
> > only one element.
> >
> > We may end up with a combined shuffle mask of one element as a result of
> > multiple calls to function 'canWidenShuffleElements()'.
> > Function canWidenShuffleElements attempts to simplify a shuffle mask by
> widening
> > the size of the elements being shuffled.
> > For every pair of shuffle indices, function canWidenShuffleElements
> checks if
> > indices refer to adjacent elements. If all pairs refer to "adjacent"
> elements
> > then the shuffle mask is safely widened. As a consequence of widening,
> we end up
> > with a new shuffle mask which is half the size of the original shuffle
> mask.
> >
> > The byte shuffle (pshufb) from test pr24562.ll has a mask of all
> SM_SentinelZero
> > indices. Function canWidenShuffleElements would combine each pair of
> > SM_SentinelZero indices into a single SM_SentinelZero index. So, in a
> > logarithmic number of steps (4 in this case), the pshufb mask is
> simplified to
> > a mask with only one index which is equal to SM_SentinelZero.
> >
> > Before this patch, function combineX86ShuffleChain wrongly assumed that
> a mask
> > of size one is always equivalent to an identity mask. So, the entire
> shuffle
> > chain was just folded away as the combined shuffle mask was treated as a
> no-op
> > mask.
> >
> > With this patch we know check if the only element of a combined shuffle
> mask is
> > SM_SentinelZero. In case, we propagate a zero vector.
> >
> > Differential Revision: http://reviews.llvm.org/D13364
> >
> > Added:
> >     llvm/trunk/test/CodeGen/X86/pr24562.ll
>
> Any reason why not to add this test to the
> appropriate-already-existent shuffle test file instead?
>

Hi Bruno,

There is no particular reason to be honest.
Probably, the most 'appropriate-already-existent' test for pshufb would be
'x86-fold-pshufb.ll'. The other existing test is 'ssse3-intrinsics-x86.ll'
but I personally tink that x86-fold-pshufb.ll is more appropriate. I'll
move the test there if okay for you.

Cheers,
Andrea


>
> Thanks,
>
> --
> Bruno Cardoso Lopes
> http://www.brunocardoso.cc
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20151015/2f6e7021/attachment.html>


More information about the llvm-commits mailing list