[LLVMdev] [llvm-commits] Vectors of Pointers and Vector-GEP

Mon Dec 5 10:02:32 PST 2011

----- Original Message -----
> Jose Fonseca <jfonseca at vmware.com> writes:
> 
> > ----- Original Message -----
> >> "Rotem, Nadav" <nadav.rotem at intel.com> writes:
> >> 
> >> > David,
> >> >
> >> > Thanks for the support! I sent a detailed email with the overall
> >> > plan. But just to reiterate, the GEP would look like this:
> >> >
> >> > 	%PV = getelementptr <4 x i32*> %base, <4 x i32> <i32 1, i32 2, i32 3, i32 4>
> >> >
> >> > Where the index of the GEP is a vector of indices. I am not against
> >> > having multiple indices. I just want to start with a basic set of
> >> > features.
> >> 
> >> Ah, I see.  I actually think multiple indices as in multiple vectors
> >> of indices to the GEP above would be pretty rare.
> >
> > Nadav, David,
> >
> > I'd like to understand a bit better the final role of these pointer
> > vector types in 64-bit architectures, where the pointers are often
> > bigger than the elements stored/fetched (e.g., 32-bit floats/ints).
> 
> The pointers are addresses.  On a 64-bit address machine they will be
> 64 bits.  On a 32-bit address machine they will be 32 bits.
> 
> For a situation like PTX that has multiple address sizes, we will
> need additional LLVM support.  Right now a pointer can only have one
> size per target.
> 
> > Will 64-bit backends be forced to actually operate with 64-bit
> > pointer vectors all the time? Or will they be able to retain
> > operations on base + 32-bit offsets as such?
> 
> Are you talking about 32-bit pointers?  If so, Nadav has talked about
> vector inttoptr and ptrtoint instructions which I think can address
> the need you're getting at.  But I'm a little unclear on what you want.
> 
> > In particular, an important use case for 3D software rendering is
> > to be able to gather <4 x i32> values, from an i32* scalar base
> > pointer in a 64-bit address space, indexed by <N x i32> offsets. [1]
> > And it is important that the intermediate <N x i32*> pointer vector
> > is never actually instantiated, as it wouldn't fit in the hardware
> > SIMD registers, and therefore would require two gather operations.
> 
> By "fit" are you worried about vector length?  If so, legalize would
> have to break up the <N x i32*> vector into two or more smaller
> vectors.
> 
> If you are worried about element size (there are only 32-bit
> elements) then inttoptr/ptrtoint should handle it, I think.

I was referring to gathering a vector of sparse 32-bit words, all addressed relative to a scalar base pointer in a 64-bit address space, with the offsets held in a 32-bit integer vector.  My other reply gave a more detailed and concrete example.
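
For illustration only, the access pattern can be written as a scalar C reference. This is a sketch under stated assumptions, not what any backend emits: the function name gather4_i32 is hypothetical, the width is fixed at 4, and the offsets are assumed to be element indices (as GEP indices are) rather than byte offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar reference for a 4-wide gather: out[i] = base[offsets[i]].
 * gather4_i32 is a hypothetical name used for illustration only.
 * Assumption: offsets are element indices, not byte offsets. */
static void gather4_i32(const int32_t *base, const int32_t offsets[4],
                        int32_t out[4])
{
    assert(base != NULL);
    for (size_t i = 0; i < 4; ++i) {
        /* Each 32-bit offset is widened to pointer width only at the
         * point of use; the four full-width addresses are never kept
         * together as a vector. */
        out[i] = base[(ptrdiff_t)offsets[i]];
    }
}
```

The point of the sketch is that the only vector-shaped inputs and outputs are 32 bits per element; the 64-bit base pointer stays scalar throughout.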

Anyway, from Nadav's and your other replies on this thread, it is now clear to me that even if the IR doesn't directly express a scalar base pointer with vector indices, the backend can always match the pattern and emit the most efficient machine instruction. This addresses my main concern.

Jose