[Libclc-dev] [PATCH 2/2] R600: improve float vload/vstore path

Aaron Watry awatry at gmail.com
Fri Jul 18 11:35:06 PDT 2014


On Fri, Jul 18, 2014 at 1:12 PM, Matt Arsenault <arsenm2 at gmail.com> wrote:
>
> On Jul 18, 2014, at 9:04 AM, Aaron Watry <awatry at gmail.com> wrote:
>
> -/*Note: R600 back-end doesn't support load <3 x ?>... so
> +/*Note: R600 back-end doesn't support store <3 x ?>... so
>  * those functions aren't actually overridden here... When the back-end supports
>  * that, then clean add here, and remove the vstore3 definitions from above.
>  */
> @@ -100,5 +106,6 @@ _CLC_DECL void __clc_vstore16_##LLVM_SCALAR_TYPE##__addr##ADDR_SPACE_ID (PRIM_TY
>   _CLC_VSTORE_ASM_DECL(int,i32,__global,1) \
>   _CLC_VSTORE_ASM_OVERLOAD_ADDR_SPACES(int,int,i32) \
>   _CLC_VSTORE_ASM_OVERLOAD_ADDR_SPACES(uint,int,i32) \
> +  _CLC_VSTORE_ASM_OVERLOAD_ADDR_SPACES(float,int,i32) \
>
>
> What’s wrong with 3 x vectors? They don’t work very well and get split into
> multiple loads currently, but they should work correctly for now.

That's good to hear.  When I originally wrote the vload/vstore code,
all I ever got was instruction selection errors for 3-element vectors,
so I left it out of the original version (over a year ago, I think).

For now, I noticed a wording error in the comment (I had copy/pasted
vload to vstore and not changed "load" to "store"), so I fixed that
while doing the float additions.  I haven't changed any actual code
here with regard to 3-element vectors.

If we want, I can revisit the feasibility of <3 x i32> loads/stores in
both vload and vstore, as I'd love to not have to special-case them
anymore.  Follow-up patch material?
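
For illustration only, the kind of special-casing discussed above can be
sketched in plain C: a 3-element access is broken into three scalar
accesses instead of a single <3 x i32> operation.  This is not the libclc
code; int3_sketch, vload3_sketch and vstore3_sketch are hypothetical names,
and the only assumption carried over is the OpenCL convention that
vload3/vstore3 address memory at p + offset * 3.

```c
#include <stddef.h>

/* Hypothetical stand-in for OpenCL's int3; not the libclc type. */
typedef struct {
    int x, y, z;
} int3_sketch;

/* Read three consecutive ints starting at p + offset * 3,
 * one scalar load per component. */
static int3_sketch vload3_sketch(size_t offset, const int *p)
{
    int3_sketch v;
    v.x = p[offset * 3 + 0];
    v.y = p[offset * 3 + 1];
    v.z = p[offset * 3 + 2];
    return v;
}

/* Write three consecutive ints starting at p + offset * 3,
 * one scalar store per component. */
static void vstore3_sketch(int3_sketch v, size_t offset, int *p)
{
    p[offset * 3 + 0] = v.x;
    p[offset * 3 + 1] = v.y;
    p[offset * 3 + 2] = v.z;
}
```

If the back-end handles 3-element vectors correctly, a fallback of this
shape could presumably be dropped in favor of the same assembly-backed
path used for the other vector widths.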

--Aaron



