[PATCH] D22822: Adjust coercion of aggregates on RenderScript

Pirama Arumuga Nainar via llvm-commits llvm-commits at lists.llvm.org
Tue Jul 26 14:51:29 PDT 2016


pirama added a comment.

In https://reviews.llvm.org/D22822#496636, @t.p.northover wrote:

> Are you aware of how inefficient the resulting ABI is once it hits CodeGen? For example using `[3 x i8]` will waste 3 full 64-bit registers for that struct. `struct { char arr[16] };` barely avoids crashing the backend it's so bad (LLVM demotes it to sret at the last moment).
>
> I suppose the real question is why do RenderScript passes need the types to have the same size, and have you really considered all other options? This ABI mangling would be an absolute last resort if I was trying to add support.


The size requirement is imposed by the RenderScript runtime.  To run a compute kernel, the RenderScript runtime iterates over its input and output buffers using a stride equal to the size of the underlying type and invokes the kernel function on each entry in the buffer.  If the kernel and the runtime disagree on the sizes of these types, the kernel reads and writes at the wrong offsets, which can lead to incorrect output.
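
To illustrate the stride issue, here is a minimal sketch in plain C.  It is not the actual RenderScript runtime: `elem3`, `sum_first_bytes`, and `demo_sum` are hypothetical names made up for this example.  It assumes a 3-byte element type and compares walking the same buffer with a stride of 3 (the real size) against a stride of 4 (what you would get if the type had been widened to a 32-bit integer).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical 3-byte element type; sizeof(elem3) is 3 on common ABIs. */
typedef struct { unsigned char v[3]; } elem3;

/* Hypothetical stand-in for the runtime loop: visit `count` entries,
   advancing `stride` bytes per entry, and sum the first byte of each. */
static unsigned sum_first_bytes(const unsigned char *buf, size_t count,
                                size_t stride) {
    unsigned sum = 0;
    for (size_t i = 0; i < count; ++i)
        sum += buf[i * stride];
    return sum;
}

/* Pack four elements whose first bytes are 1, 2, 3, 4 into a zero-padded
   buffer and walk it with the given stride. */
static unsigned demo_sum(size_t stride) {
    elem3 in[4] = {{{1, 0, 0}}, {{2, 0, 0}}, {{3, 0, 0}}, {{4, 0, 0}}};
    unsigned char raw[16] = {0};   /* large enough for a stride of 4 */
    memcpy(raw, in, sizeof in);
    return sum_first_bytes(raw, 4, stride);
}
```

With these assumptions, `demo_sum(sizeof(elem3))` visits the start of every element, while `demo_sum(4)` lands in the middle of later elements and in the padding, so the two results differ even though the buffer contents are identical.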

An alternative we considered is for the runtime to duplicate this coercion logic, but that would be complex to implement there and harder to maintain.


================
Comment at: test/CodeGen/renderscript.c:138-139
@@ +137,4 @@
+
+// CHECK-RS32: void @retLong9(%struct.sLong9*
+// CHECK-RS64: void @retLong9(%struct.sLong9*
+sLong9 retLong9() { sLong9 r; return r; }
----------------
t.p.northover wrote:
> Shouldn't these be sret? (And above).
Yes, these should be sret.  I was lazy and did not check the whole signature.  I'll upload an update shortly.


https://reviews.llvm.org/D22822




