[LLVMdev] Making GEP into vector illegal?

Wed Oct 15 13:47:49 PDT 2008

On Oct 15, 2008, at 9:11 AM, Daniel M Gessel wrote:

> Just for reference, my C example was for my own clarification.
>
> I dived into LLVM having to write a TargetMachine and I've been
> keeping busy without having to learn much IR (yet). I was really
> trying to use C as a pseudo-IR.
>
> I get that the idea is allowing IR to directly express the address of
> part of a vector complicates (prevents?) certain optimizations.
>
> However, due to my own ignorance, I don't understand why.

The basic issue is that allowing it makes the optimizer think that
forming such a GEP is a good thing to do.  Here is a silly example:

   myfloat4 X = ...
   float Y = *(float*)&X;

This is an idiom that people commonly use with GCC, because it doesn't  
support syntax like "X[0]".  In this case, the optimizer can sometimes  
see the cast to float as being an access to the first element of the  
vector.  Depending on phase ordering, it can rewrite the bitcast of the
address into a "gep &X, 0, 0" followed by a load.  This form makes
it harder for later optimization passes to understand what is going on
(we want the optimizer to "raise" this to a single extractelement
operation from X).
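
For concreteness, here is a minimal C sketch of the two source-level
forms being contrasted: the pointer-cast idiom and direct element
access.  It assumes a compiler with the GCC-style vector_size
extension and vector subscripting (recent GCC and Clang).  The
function names and demo() are hypothetical, chosen only for
illustration.

```c
#include <assert.h>

/* GCC/Clang vector extension; the name myfloat4 follows the example
   in the message. */
typedef float myfloat4 __attribute__((vector_size(16)));

/* The pointer-cast idiom: reinterpret the vector's address as a
   pointer to its element type and load through it. */
float first_via_cast(myfloat4 x) {
    return *(float *)&x;
}

/* Direct element access, for compilers that accept subscripting on
   vector types.  This states the intent (extract lane 0) directly. */
float first_via_subscript(myfloat4 x) {
    return x[0];
}

/* Usage sketch: both accessors are expected to read the first lane. */
void demo(void) {
    myfloat4 x = {1.0f, 2.0f, 3.0f, 4.0f};
    assert(first_via_cast(x) == 1.0f);
    assert(first_via_subscript(x) == 1.0f);
}
```

Both functions ask for the same value; the point is that the first
one only says so indirectly, through an address computation.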

> My first thought was that a GEP to part of a vector shouldn't really
> pose any more complications (restrictions?) than a GEP of the vector
> as a whole. That's what I was trying to get at with my example.

IR design is an art, not a science.  If you ignored implementation  
complexity and engineering issues, all solutions would be equally  
good.  Because we do care about the implementation difficulty and the  
ability to make things work without huge amounts of effort, we try to  
design the IR to be simple and orthogonal where possible.  For  
example, if we continue to allow GEPs to index into vector elements,  
we have to update all the optimizers that might (accidentally!) do  
this, and give them a simple cost model that says not to!

In my opinion, this is a case that we already model well with gep +
bitcast + gep, and being able to represent these cases as a single gep
doesn't give us any benefit over that existing form.
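
As a rough C-level analogy for the three-step form (this is only a
sketch, not the IR itself): take the address of the vector inside
some aggregate, reinterpret that address as a pointer to the element
type, then index from there.  The struct and function names below
are hypothetical, and the mapping of each C step onto gep or bitcast
is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef float myfloat4 __attribute__((vector_size(16)));

/* Hypothetical aggregate holding a vector, used only for illustration. */
struct particle {
    int id;
    myfloat4 pos;
};

/* Compute the address of one lane in three separate steps. */
float *lane_address(struct particle *p, size_t lane) {
    myfloat4 *vec = &p->pos;      /* step 1: address of the whole vector (like a gep) */
    float *base = (float *)vec;   /* step 2: reinterpret as element pointer (like a bitcast) */
    return base + lane;           /* step 3: index over elements (like a second gep) */
}

/* Usage sketch: read and write a lane through the computed address. */
void lane_demo(void) {
    struct particle p = {7, {1.0f, 2.0f, 3.0f, 4.0f}};
    assert(*lane_address(&p, 2) == 3.0f);
    *lane_address(&p, 1) = 9.0f;
    assert(p.pos[1] == 9.0f);
}
```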

-Chris