[LLVMdev] Extending vector operations

Mon Jul 21 13:21:50 PDT 2008

Hi,

We would like to extend the vector operations in LLVM a bit. We're
hoping to get some feedback on the right way to go, or some starting
points. I previously had some discussion on this list about a subset
of the changes we have in mind.

All of these changes are intended to make target-independent IR (i.e.
IR without machine-specific intrinsics) generate better code or be
easier to generate from a frontend with vector support (whether from
manual vectorization or autovectorization).

If you have any insight into how to best get started with any of these  
changes, and whether they are feasible and sensible, please let me  
know. We're mostly interested in x86 as a target in the short term,  
but obviously want these to apply to other LLVM targets as well. We're  
prepared to implement these changes, but would like to hear any  
suggestions and objections you might have.

Below are the specific additions we have in mind.

===
1) Vector shl, lshr, ashr

I think these are no-brainers. We would like to extend the semantics  
of the shifting instructions to naturally apply to vectors as well.  
One issue is that these operations often only support a single shift  
amount for an entire vector. I assume it should be fairly  
straightforward to select on this pattern, and scalarize the general  
case as necessary.
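
A minimal sketch of the distinction, written as a Python model of the
lanes rather than as LLVM code. The function names are made up for
illustration, lanes are treated as unsigned 32-bit values, and shift
amounts are assumed to be smaller than the lane width:

```python
def is_uniform(amounts):
    """True when every lane uses the same shift amount."""
    return all(a == amounts[0] for a in amounts)


def shl_lane(value, amount, bits=32):
    """Shift one unsigned lane left, wrapping the result to `bits` bits."""
    return (value << amount) & ((1 << bits) - 1)


def vector_shl(values, amounts, bits=32):
    """Lane-wise shift left; also report which lowering would apply.

    "single-amount" stands for the case a target could handle with one
    shift of the whole vector, "scalarized" for the general fallback.
    """
    if len(values) != len(amounts):
        raise ValueError("operands must have the same number of lanes")
    strategy = "single-amount" if is_uniform(amounts) else "scalarized"
    result = [shl_lane(v, a, bits) for v, a in zip(values, amounts)]
    return result, strategy
```

Both paths produce the same lane values; only the strategy label
differs, which is the property a selection pattern would rely on.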

2) Vector trunc, sext, zext, fptrunc and fpext

Again, I hope these are straightforward. Please let me know if you
expect any issues with vector operations whose result element size
differs from the operand element size, e.g. around legalization.
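
For the integer cases, the intended lane-wise behaviour can be modelled
roughly as follows. This is a Python sketch with hypothetical helper
names, treating each lane as an unsigned bit pattern. Note that the
lane count stays fixed while the total vector width changes: four
32-bit lanes truncated to 16 bits go from 128 bits to 64 bits.

```python
def vtrunc(values, to_bits):
    """Keep only the low `to_bits` bits of every lane."""
    mask = (1 << to_bits) - 1
    return [v & mask for v in values]


def vzext(values, from_bits):
    """Zero-extend: the bit pattern is unchanged, upper bits are zero."""
    mask = (1 << from_bits) - 1
    return [v & mask for v in values]


def vsext(values, from_bits, to_bits):
    """Sign-extend each lane from `from_bits` to `to_bits` bits."""
    sign_bit = 1 << (from_bits - 1)
    in_mask = (1 << from_bits) - 1
    out_mask = (1 << to_bits) - 1
    result = []
    for v in values:
        v &= in_mask
        # Reinterpret the lane as a signed value, then re-encode it wider.
        signed = v - (1 << from_bits) if v & sign_bit else v
        result.append(signed & out_mask)
    return result


def total_bits(lanes, bits_per_lane):
    """Total width of a vector with the given shape."""
    return lanes * bits_per_lane
```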

3) Vector intrinsics for floor, ceil, round, frac/modf

These are operations that are not trivially specified in terms of  
simpler operations. It would be nice to have these as overloaded,  
target-independent intrinsics, in the same way as llvm.cos etc. are  
supported now.
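
As a reference for what the element-wise results would look like, here
is a small Python model. The function names are hypothetical, finite
inputs are assumed, and round is left out because the rounding rule for
ties is not specified above. The frac/modf variant follows the C modf
convention, where both parts carry the sign of the input.

```python
import math


def vfloor(values):
    """Lane-wise floor, keeping floating-point lanes."""
    return [float(math.floor(v)) for v in values]


def vceil(values):
    """Lane-wise ceiling, keeping floating-point lanes."""
    return [float(math.ceil(v)) for v in values]


def vmodf(values):
    """Split every lane into (fractional parts, integral parts)."""
    pairs = [math.modf(v) for v in values]
    fractional = [f for f, _ in pairs]
    integral = [i for _, i in pairs]
    return fractional, integral
```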

4) Vector select

We consider a vector select extremely important for a number of  
operations. This would be an extension of select to support an <N x  
i1> vector mask to select between elements of <N x T> vectors for some  
basic type T. Vector min, max, sign, etc. can be built on top of this  
operation.
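
A rough Python model of the proposed semantics, with min and sign
built on top of it. The names vselect, vmin and vsign are made up for
this sketch, and Python booleans stand in for the i1 mask lanes:

```python
def vselect(mask, a, b):
    """Per lane, take the element of `a` where the mask is true, else `b`."""
    if not (len(mask) == len(a) == len(b)):
        raise ValueError("mask and operands must have the same lane count")
    return [x if m else y for m, x, y in zip(mask, a, b)]


def vmin(a, b):
    """Lane-wise minimum: one compare followed by one select."""
    return vselect([x < y for x, y in zip(a, b)], a, b)


def vsign(a):
    """Lane-wise sign (-1.0, 0.0 or 1.0) from two compares and two selects."""
    n = len(a)
    ones, zeros, minus_ones = [1.0] * n, [0.0] * n, [-1.0] * n
    positive_or_zero = vselect([x > 0 for x in a], ones, zeros)
    return vselect([x < 0 for x in a], minus_ones, positive_or_zero)
```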

5) Vector comparisons that return <N x i1>

This is maybe not a must-have, and perhaps more a question of
preference. I understand that the current vfcmp/vicmp semantics
(returning a vector of iK, where K matches the bit width of the
operands being compared, with the high bit set or not) are there for
pragmatic reasons, and that these instructions exist to help with
generated code that uses machine-specific intrinsics.

For code that does not use machine intrinsics, I believe it would be  
cleaner, simpler, and potentially more efficient, to have a vector  
compare that returns <N x i1> instead. For example, in conjunction  
with the above-mentioned vector select, this would allow a max to be  
expressed simply as a sequence of compare and select.

With the current compare semantics, vector bitshifts would actually
reduce the amount of code generated for something like a vectorized
max, but they make the patterns for recognizing such operations a lot
longer.

I realize this is probably the most controversial change amongst  
these. I gather there is some concern about representing "variable  
width" i1s, but I would contend that that's the case even for i1s  
which are not vectors.
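
To make the difference between the two mask styles concrete, here is a
Python model of an unsigned 32-bit max written both ways. All names are
hypothetical, and the high-bit mask follows the description of the
current semantics given above rather than any checked reference:

```python
BITS = 32
LANE_MASK = (1 << BITS) - 1
HIGH_BIT = 1 << (BITS - 1)


def cmp_gt_i1(a, b):
    """Proposed style: one boolean (i1) per lane."""
    return [x > y for x, y in zip(a, b)]


def vselect(mask, a, b):
    """Per lane, take `a` where the mask is true, else `b`."""
    return [x if m else y for m, x, y in zip(mask, a, b)]


def max_with_i1(a, b):
    """Max as a single compare followed by a single select."""
    return vselect(cmp_gt_i1(a, b), a, b)


def cmp_gt_highbit(a, b):
    """Mask style described above: only the high bit of a lane is set."""
    return [HIGH_BIT if x > y else 0 for x, y in zip(a, b)]


def ashr_lane(value, amount):
    """Arithmetic shift right of one lane, keeping the sign bit."""
    signed = value - (1 << BITS) if value & HIGH_BIT else value
    return (signed >> amount) & LANE_MASK


def max_with_highbit(a, b):
    """Max via shift, and, and-not, or: the longer pattern to recognize."""
    result = []
    for m, x, y in zip(cmp_gt_highbit(a, b), a, b):
        full = ashr_lane(m, BITS - 1)  # smear the high bit across the lane
        result.append((x & full) | (y & ~full & LANE_MASK))
    return result
```

Both functions compute the same result, but the second needs four
operations per max that a backend would have to match as a unit.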
===

In addition to the above suggestions, I'd also like to hear what
others think about handling vectors whose element count isn't a power
of two, e.g. <3 x float> operations. I gather the status quo is that
only power-of-two sizes are expected to work (although we've found
some bugs for things like <2 x float> that we're submitting). Ideally,
<3 x float> operands would usually be rounded up to a size the machine
supports directly. We can try to do this in the frontend, but it would
of course be ideal if these just worked. I'm curious if anyone else
out there has dealt with this already and has some suggestions.
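
The rounding-up idea can be sketched in Python as follows. This
assumes the supported size is the next power of two and that padding
lanes are filled with zero; both are assumptions of the sketch, and the
helper names are made up:

```python
def round_up_pow2(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    p = 1
    while p < n:
        p *= 2
    return p


def pad(values, fill=0.0):
    """Widen a vector to a power-of-two lane count using `fill`."""
    width = round_up_pow2(len(values))
    return values + [fill] * (width - len(values))


def padded_add(a, b):
    """Add two vectors at the padded width, then drop the extra lanes."""
    if len(a) != len(b):
        raise ValueError("operands must have the same number of lanes")
    wide = [x + y for x, y in zip(pad(a), pad(b))]
    return wide[:len(a)]
```

For an addition the contents of the padding lanes do not matter, but
for something like a division the fill value would need more care so
that the unused lanes cannot trap or raise exceptions.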

Please let me know what you think,

Stefanus

--
Stefanus Du Toit <stefanus.dutoit at rapidmind.com>
   RapidMind Inc.
   phone: +1 519 885 5455 x116 -- fax: +1 519 885 1463