[Libclc-dev] [PATCH 0/9] R600 load/store improvements
awatry at gmail.com
Tue Jul 22 18:46:41 PDT 2014
This series attempts to clean up the r600 int32 vload/vstore logic.
Once the series is applied, we use assembly versions of vload/vstore
for all int32 loads/stores in all address spaces on r600. The only
downside is that the RFC patch at the beginning of the series is
currently necessary to prevent the nextafter/sign builtins from
regressing for float16 data types (see the associated piglit tests).
The RFC patch shouldn't have any side effects, but it seems to paper
over a bug somewhere in clang or llvm. It does seem to result in slight
savings due to some simpler shufflevector operations.
With this series applied, clover for the OSS radeon driver generates
much cleaner llvm bitcode from CL kernels on my machine when
using 32-bit int/float types. In the future, I'll take a shot at
extending the assembly optimizations to other types such as char,
short, long, double, and half.