[Libclc-dev] [PATCH 2/2] R600: improve float vload/vstore path
awatry at gmail.com
Fri Jul 18 11:35:06 PDT 2014
On Fri, Jul 18, 2014 at 1:12 PM, Matt Arsenault <arsenm2 at gmail.com> wrote:
> On Jul 18, 2014, at 9:04 AM, Aaron Watry <awatry at gmail.com> wrote:
> -/*Note: R600 back-end doesn't support load <3 x ?>... so
> +/*Note: R600 back-end doesn't support store <3 x ?>... so
> * those functions aren't actually overridden here... When the back-end
> * that, then clean add here, and remove the vstore3 definitions from above.
> @@ -100,5 +106,6 @@ _CLC_DECL void
> __clc_vstore16_##LLVM_SCALAR_TYPE##__addr##ADDR_SPACE_ID (PRIM_TY
> _CLC_VSTORE_ASM_DECL(int,i32,__global,1) \
> _CLC_VSTORE_ASM_OVERLOAD_ADDR_SPACES(int,int,i32) \
> _CLC_VSTORE_ASM_OVERLOAD_ADDR_SPACES(uint,int,i32) \
> + _CLC_VSTORE_ASM_OVERLOAD_ADDR_SPACES(float,int,i32) \
> What’s wrong with 3 x vectors? They don’t work very well and get split into
> multiple loads currently, but they should work correctly for now
That's good to hear. When I originally wrote the vload/vstore code,
all I ever got was instruction selection errors for 3-element vectors,
so I left it out of the original version (over a year ago, I think).
For now, I noticed a wording error in the comment (I had copy/pasted
it from vload to vstore without changing "load" to "store"), so I
fixed that while doing the float additions. I haven't changed any
actual code here with regard to 3-element vectors.
If we want, I can re-visit the feasibility of <3 x i32> load/stores in
both vload/vstore, as I'd love to not have to special case it anymore.
Follow-up patch material?
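
For reference, the kind of special case I mean looks roughly like the
sketch below. This is plain C rather than OpenCL C, and the names
(example_int3, example_vstore3, example_vload3) are made up for
illustration; it is not the actual libclc code. It only shows the idea
of handling the 3-element case with scalar accesses instead of asking
the back-end for a <3 x i32> load or store.

```c
#include <stddef.h>

/* Hypothetical stand-in for an OpenCL int3; not a libclc type. */
typedef struct {
    int s0, s1, s2;
} example_int3;

/* Store a 3-element vector as three scalar stores.
 * As with vstore3, 'offset' counts whole 3-element vectors,
 * so the first element lands at mem[3 * offset]. */
static void example_vstore3(example_int3 vec, size_t offset, int *mem)
{
    mem[3 * offset + 0] = vec.s0;
    mem[3 * offset + 1] = vec.s1;
    mem[3 * offset + 2] = vec.s2;
}

/* Load a 3-element vector as three scalar loads (the vload3 analogue). */
static example_int3 example_vload3(size_t offset, const int *mem)
{
    example_int3 vec;
    vec.s0 = mem[3 * offset + 0];
    vec.s1 = mem[3 * offset + 1];
    vec.s2 = mem[3 * offset + 2];
    return vec;
}
```

If the back-end handles <3 x ?> correctly now, a fallback like this
could presumably go away in favor of the same overridden path the
other vector widths use.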