[PATCH] D17041: [X86] Don't assume that a shuffle operand is #0: it isn't for VPERMV.
Ahmed Bougacha via llvm-commits
llvm-commits at lists.llvm.org
Tue Feb 9 14:13:55 PST 2016
ab created this revision.
ab added reviewers: RKSimon, spatel.
ab added a subscriber: llvm-commits.
Since:
r246981 AVX-512: Lowering for 512-bit vector shuffles.
VPERMV is recognized in getTargetShuffleMask.
This breaks assumptions in most callers, as they expect N->getOperand(0) to be (one of) the vector operand(s). It isn't, as VPERMV has the mask as operand #0 (I can't think of another shuffle-like instruction that works the same).
In the added testcase, this leads the funny-looking:
vmovdqa .LCPI0_0(%rip), %ymm0 # ymm0 = [0,1,2,3,4,5,6,4]
vpshufb .LCPI0_1(%rip), %ymm0, %ymm0 # ymm0 = ymm0[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,16,17,18,18]
In my original testcase (s/i32 4>)/i32 1>)/ should do the trick) , the VPSHUFB lane restriction was another problem, but Simon fixed that in r260063.
I can think of two obvious solutions:
- swap the X86ISD::VPERMV operands, commenting in X86ISelLowering.h that it's different from the instructions. IMO, it's confusing either way.
- return the operands and fix the users. There are many users, some of which (e.g., setTargetShuffleZeroElements) only return a mask themselves. This doesn't seem perfect either.
This (very rough, WIP) patch implements the latter.
What do you think? We might improve this by having a struct wrap <Mask, IsUnary, Ops>, and hopefully avoid computing slightly different things in different places.
http://reviews.llvm.org/D17041
Files:
lib/Target/X86/X86ISelLowering.cpp
test/CodeGen/X86/avx2-vperm-combining.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D17041.47363.patch
Type: text/x-patch
Size: 15641 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160209/6ae0662f/attachment.bin>
More information about the llvm-commits
mailing list