patch: make instcombine remove shuffles by reordering vector elements

Duncan Sands duncan.sands at gmail.com
Sun May 5 01:47:29 PDT 2013


Hi Anton,

On 05/05/13 10:22, Anton Korobeynikov wrote:
>>> We lower x86
>>> shuffles with 1000 lines of c++ code.
>>
>> Maybe that's not so bad ;) The PPC has a whole perfect-shuffle generation framework to handle these kinds of things for Altivec. Have you ever looked at PPCPerfectShuffle.h and utils/PerfectShuffle/PerfectShuffle.cpp?
> Same on ARM. But everything is only for 4-element shuffle. Doing same
> for 8 element shuffles looks like an impractical task (both in time
> and memory requirement for shuffle table).
>
> We can "cheat" with some clever "8 el shuffle to 4 el shuffle"
> lowering pass, but I'm not aware of any.
>
> And on x86 we have much wider regs...
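
To put a rough number on the memory concern: as far as I recall, the 4-element
table is indexed by four result elements, each naming one of the 8 source
elements or "undef". The sketch below assumes that same encoding carries over
to 8 elements; tableEntries is a made-up helper, not anything in the tree, and
it needs C++14 for the constexpr loop.

```cpp
#include <cstdint>

// Hypothetical helper: number of table entries if each of `lanes` result
// elements may name any of the 2*lanes source elements or be undefined.
constexpr std::uint64_t tableEntries(unsigned lanes) {
  std::uint64_t choices = 2ull * lanes + 1; // 2*lanes sources plus "undef"
  std::uint64_t entries = 1;
  for (unsigned i = 0; i < lanes; ++i)
    entries *= choices;
  return entries;
}

static_assert(tableEntries(4) == 6561, "9^4 entries for 4 elements");
static_assert(tableEntries(8) == 6975757441ull, "17^8 entries for 8 elements");
```

At one 32-bit word per entry (an assumption), the 8-element table would be
around 28 GB, against about 26 KB for the 4-element one.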

How are the perfect shuffle tables generated?  I'm assuming it is done by
solving, offline and for each shuffle, an optimization problem whose objective
function is based on known characteristics of the processor.  What are those
characteristics?  Maybe it is possible to solve the optimization problem, or
get a near-optimal solution, on the fly with a sufficiently clever algorithm.
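
To make the kind of offline optimization I have in mind concrete, here is a
minimal sketch. It is not the code in utils/PerfectShuffle; the four ops, their
unit costs, and the names buildCostTable/encode/decode are all invented for
illustration, and undef elements are ignored. It relaxes costs to a fixed point
over all two-input 4-element masks, charging each result the cost of both
operands plus the op.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Four source indices: 0-3 select from the left input, 4-7 from the right.
using Mask = std::array<std::uint8_t, 4>;

constexpr unsigned NumMasks = 8 * 8 * 8 * 8; // no "undef" in this sketch
constexpr unsigned Unreachable = ~0u;

static unsigned encode(const Mask &m) {
  return (m[0] << 9) | (m[1] << 6) | (m[2] << 3) | m[3];
}

static Mask decode(unsigned code) {
  assert(code < NumMasks);
  return {std::uint8_t((code >> 9) & 7), std::uint8_t((code >> 6) & 7),
          std::uint8_t((code >> 3) & 7), std::uint8_t(code & 7)};
}

// Hypothetical target ops, each described as a fixed two-input shuffle.
struct Op { const char *name; Mask pattern; unsigned cost; };
static const Op Ops[] = {
    {"merge_lo", {0, 4, 1, 5}, 1},
    {"merge_hi", {2, 6, 3, 7}, 1},
    {"rotate1",  {1, 2, 3, 4}, 1},
    {"splat0",   {0, 0, 0, 0}, 1},
};

// Compose: apply `op` to two vectors that are themselves shuffles a and b
// of the original inputs.
static Mask apply(const Op &op, const Mask &a, const Mask &b) {
  Mask r;
  for (int i = 0; i < 4; ++i) {
    std::uint8_t s = op.pattern[i];
    r[i] = s < 4 ? a[s] : b[s - 4];
  }
  return r;
}

// Cheapest known cost for every mask, ignoring sequences above maxCost.
static std::vector<unsigned> buildCostTable(unsigned maxCost) {
  std::vector<unsigned> cost(NumMasks, Unreachable);
  cost[encode({0, 1, 2, 3})] = 0; // left input, free
  cost[encode({4, 5, 6, 7})] = 0; // right input, free
  for (bool changed = true; changed;) {
    changed = false;
    std::vector<unsigned> known; // masks reachable so far
    for (unsigned c = 0; c < NumMasks; ++c)
      if (cost[c] != Unreachable)
        known.push_back(c);
    for (unsigned a : known)
      for (unsigned b : known) {
        unsigned base = cost[a] + cost[b];
        if (base >= maxCost)
          continue;
        for (const Op &op : Ops) {
          unsigned total = base + op.cost;
          if (total > maxCost)
            continue;
          unsigned r = encode(apply(op, decode(a), decode(b)));
          if (total < cost[r]) {
            cost[r] = total;
            changed = true;
          }
        }
      }
  }
  return cost;
}
```

With these made-up ops, buildCostTable(3) gives cost 1 for the mask {0,4,1,5}
(a single merge_lo) and cost 2 for {0,4,0,5} (splat0 followed by merge_lo).
The processor characteristics I am asking about would enter through the op
list and the per-op costs.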

Ciao, Duncan.



More information about the llvm-commits mailing list