[LLVMdev] [llvm-commits] Vectors of Pointers and Vector-GEP

Tue Nov 29 13:48:46 PST 2011

Hi Jose, 

The proposed IR change neither helps nor hinders the use case you mentioned. The base + vector-index case should be easy to address with an intrinsic. The pointer-vector proposal is meant to support full scatter/gather instructions (such as the AVX2 gather instructions).

Nadav

-----Original Message-----
From: Jose Fonseca [mailto:jfonseca at vmware.com] 
Sent: Tuesday, November 29, 2011 22:25
To: Rotem, Nadav; David A. Greene
Cc: LLVM Developers Mailing List
Subject: Re: [LLVMdev] [llvm-commits] Vectors of Pointers and Vector-GEP

----- Original Message -----
> "Rotem, Nadav" <nadav.rotem at intel.com> writes:
> 
> > David,
> >
> > Thanks for the support! I sent a detailed email with the overall
> > plan. But just to reiterate, the GEP would look like this:
> >
> > 	%PV = getelementptr <4 x i32*> %base, <4 x i32> <i32 1, i32 2, i32 3, i32 4>
> >
> > Where the index of the GEP is a vector of indices. I am not against
> > having multiple indices. I just want to start with a basic set of
> > features.
> 
> Ah, I see.  I actually think multiple indices as in multiple vectors
> of indices to the GEP above would be pretty rare.

Nadav, David, 

I'd like to better understand the eventual role of these pointer vector types on 64-bit architectures, where pointers are often wider than the elements being stored or fetched (e.g., 32-bit floats/ints).

Will 64-bit backends be forced to operate on vectors of 64-bit pointers all the time? Or will they be able to keep operations expressed as base + 32-bit offsets?

In particular, an important use case for 3D software rendering is gathering <4 x i32> values from a scalar i32* base pointer in a 64-bit address space, indexed by <N x i32> offsets. [1] It is important that the intermediate <N x i32*> pointer vector is never actually materialized: it would not fit in the hardware SIMD registers, and would therefore require two gather operations.
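
As a rough illustration of the semantics in question, here is a scalar C sketch (not LLVM IR, and not any proposed intrinsic). The function name gather4_i32 and the fixed width of 4 lanes are assumptions made for the example. Each lane's address is formed from the single base pointer plus a 32-bit offset, so no vector of 64-bit pointers is ever built:

```c
#include <stdint.h>

/* Hypothetical helper: scalar reference semantics of a 4-wide gather
 * from one scalar base pointer plus a vector of 32-bit offsets. */
void gather4_i32(const int32_t *base, const int32_t offsets[4], int32_t out[4])
{
    for (int lane = 0; lane < 4; ++lane) {
        /* The address is computed per lane as base + offset; the only
         * per-lane state is the 32-bit offset, never a 64-bit pointer. */
        out[lane] = base[offsets[lane]];
    }
}
```

A backend that recognizes this base + offsets shape could, in principle, map the loop to a single hardware gather, which is the outcome asked about here.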

It would be nice to see how this use case would look in the proposed IR, and get assurance that backends will be able to emit efficient code (i.e., a single gather instruction) from that IR.

Jose

[1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-June/040825.html