[PATCH] X86: sink splat-shuffle into block doing a shift.
Tim Northover
t.p.northover at gmail.com
Mon Feb 17 09:19:14 PST 2014
Hi Nadav,
> Thanks for working on this. I think that your approach makes sense. Did you look at the AVX512 instruction set?
I hadn't, but I've had a quick look now and it doesn't seem to differ from AVX2 in which element widths benefit.
> bool X86TargetLowering::isVariableVectorShiftExpensive(Type *Ty) const {
>   if (Subtarget->hasInt256() && (Bits == 32 || Bits == 64))
>     return false;
>
>   return true;
> }
Doesn't that have different behaviour? It would sink for i8, for example. I can see not wanting to complicate it for pre-SSE CPUs, but I'm reluctant to duplicate the broadcast for 8-bit types.
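To make the difference concrete, here is a rough standalone sketch rather than the patch itself. The function names, the plain bool standing in for Subtarget->hasInt256(), and the early return for 8-bit elements are all assumptions made for illustration. It only models the decision "is a variable vector shift expensive enough that sinking the splat is worthwhile?".

```cpp
#include <cassert>

// Hypothetical model of the quoted suggestion: everything is "expensive"
// (so the splat gets sunk) unless we have AVX2 and 32- or 64-bit elements.
bool expensiveAsSuggested(bool hasInt256, unsigned bits) {
  if (hasInt256 && (bits == 32 || bits == 64))
    return false;
  return true;
}

// Hypothetical variant that additionally declines to sink for 8-bit
// elements, so the broadcast is not duplicated for i8 vectors.
bool expensiveExcludingI8(bool hasInt256, unsigned bits) {
  if (bits == 8)
    return false;
  return expensiveAsSuggested(hasInt256, bits);
}

// Small self-check showing where the two predicates disagree.
void checkModel() {
  assert(expensiveAsSuggested(true, 8));    // suggestion: sinks for i8
  assert(!expensiveExcludingI8(true, 8));   // variant: leaves i8 alone
  assert(expensiveAsSuggested(false, 32) == expensiveExcludingI8(false, 32));
}
```

The two predicates agree for every width except 8, which is the case I'm worried about.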
> This can be extracted into a helper function - isBroadcast / isSplat.
Sounds like a good idea.
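For illustration only, a helper of that kind might look roughly like the standalone sketch below. It is not the code in the patch: the name isSplatMask, the use of a std::vector<int> for the shuffle mask, and the treatment of an all-undefined mask are assumptions. It follows the usual convention that a negative mask entry means an undefined lane.

```cpp
#include <vector>

// Hypothetical model: a shuffle mask is a list of source lane indices,
// with -1 standing for an undefined lane. The mask is a splat if every
// defined entry selects the same source lane.
bool isSplatMask(const std::vector<int> &mask) {
  int lane = -1;
  for (int idx : mask) {
    if (idx < 0)
      continue;            // undefined lanes are compatible with any splat
    if (lane < 0)
      lane = idx;          // first defined entry fixes the splat source
    else if (idx != lane)
      return false;        // a second, different source lane: not a splat
  }
  return lane >= 0;        // assumption: all-undef (or empty) is not a splat
}
```

Keeping the check in one place means the sinking code only has to ask a single yes/no question about the shuffle.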
> Do we really need this check? Won’t SelectionDAG CSE it for us?
> Won’t SDAG do it for us?
I suspect it would be able to handle both of those (a dead shuffle at the source and duplicated shuffles at the destination), but the CodeGenPrepare convention seems to be to tidy up after yourself.
How does this version look? I've (partially) simplified the X86TargetLowering function and extracted the isBroadcastShuffle function, but left the tidying.
Tim.
http://llvm-reviews.chandlerc.com/D2816
CHANGE SINCE LAST DIFF
http://llvm-reviews.chandlerc.com/D2816?vs=7162&id=7164#toc
Files:
include/llvm/Target/TargetLowering.h
lib/Target/X86/X86ISelLowering.cpp
lib/Target/X86/X86ISelLowering.h
lib/Transforms/Scalar/CodeGenPrepare.cpp
test/Transforms/CodeGenPrepare/x86-shuffle-sink.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D2816.2.patch
Type: text/x-patch
Size: 9224 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140217/ea9368aa/attachment.bin>