[cfe-dev] Vectorizer and buffer overlaps

Fri May 2 09:48:00 PDT 2014

On 05/02/2014 04:04 AM, Nicola Gigante wrote:
> Hello
>
> I feel like this is a trivial question but I can’t find an answer.
>
> I’m looking at the llvm vectorizer to learn how to better
> take advantage of it.
>
> I have this function:
>
> void product(float *data1, float *data2, float *result, size_t size) {
>      size_t i = 0;
>      for(i = 0; i < size; i++) {
>          result[i] = data1[i] * data2[i];
>      }
> }
>
> It seems to get vectorized correctly with -O3, but I can see
> the vector.memcheck block that looks for overlapping buffers.
>
> How can I inform the compiler that I know the buffers won’t overlap?
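It sounds like you're looking for the "restrict" keyword. It is standard 
in C99; C++ has no standard equivalent, but Clang and GCC accept 
__restrict__ as an extension. I haven't looked at how this gets 
translated into LLVM IR, but you can run it through Clang to check.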
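
A minimal sketch of that suggestion, assuming Clang or GCC, where 
__restrict__ is accepted in C++ as an extension (C99 spells it 
"restrict"). The name product_restrict is only illustrative. Whether the 
vector.memcheck block actually disappears is something to confirm in 
the -O3 output; I have not verified it here.

```cpp
#include <cstddef>

// Hypothetical variant of the original product() with restrict-qualified
// pointers. __restrict__ is a compiler extension in C++, not standard.
// The qualifier is a promise by the caller that the three buffers do not
// overlap; calling this with overlapping buffers is undefined behavior.
void product_restrict(const float *__restrict__ data1,
                      const float *__restrict__ data2,
                      float *__restrict__ result,
                      std::size_t size) {
    for (std::size_t i = 0; i < size; i++) {
        result[i] = data1[i] * data2[i];
    }
}
```

The inputs are also const-qualified here since the loop only reads them; 
that is a style choice and not what removes the overlap check.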
>
> Simulating a real case (kind of), I’ve tried something like this:
> struct mass_tag;
> struct acceleration_tag;
> struct force_tag;
>
> template<typename T, typename>
> class wrapper {
>      T _v;
> public:
>      wrapper(T v) : _v(v) { }
>      operator T() const { return _v; }
> };
>
> using mass = wrapper<float, mass_tag>;
> using acceleration = wrapper<float, acceleration_tag>;
> using force = wrapper<float, force_tag>;
>
> void product(mass *data1, acceleration *data2,
>               force *result, size_t size) {
>      size_t i = 0;
>      for(i = 0; i < size; i++) {
>          result[i] = data1[i] * data2[i];
>      }
> }
>
> The concept here is reminiscent of something like Boost.Units.
> I thought that, since the pointers have different types,
> type-based alias analysis would say that they don’t overlap,
> but I’m missing something since the memcheck block is still
> there. I’m compiling with clang 3.4 with -O3 -fstrict-aliasing, isn’t that enough?
I'm going to leave this part to someone else.  I'm not quite sure of the 
C++ rules here.  I suspect you're suffering from the general "cast 
through void*" problem though.
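
Independent of what type-based alias analysis can or cannot prove here, 
the no-overlap promise can also be stated explicitly on the wrapper 
pointers. This is only a sketch, assuming the same __restrict__ extension 
(Clang/GCC); product_tagged is a made-up name, and I have not checked 
whether the memcheck block goes away with clang 3.4.

```cpp
#include <cstddef>

struct mass_tag;
struct acceleration_tag;
struct force_tag;

// Same tagged wrapper as in the question; the tag only distinguishes types.
template <typename T, typename Tag>
class wrapper {
    T _v;
public:
    wrapper(T v) : _v(v) {}
    operator T() const { return _v; }
};

using mass = wrapper<float, mass_tag>;
using acceleration = wrapper<float, acceleration_tag>;
using force = wrapper<float, force_tag>;

// Hypothetical variant: the caller promises the three buffers do not
// overlap. __restrict__ is a compiler extension in C++.
void product_tagged(const mass *__restrict__ data1,
                    const acceleration *__restrict__ data2,
                    force *__restrict__ result,
                    std::size_t size) {
    for (std::size_t i = 0; i < size; i++) {
        // Both operands convert to float; the float product is then
        // converted back to force through the wrapper constructor.
        result[i] = data1[i] * data2[i];
    }
}
```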
>
> Even this version has the memcheck block:
> void product(std::array<mass, 32> data1, std::array<acceleration, 32> data2, std::array<force, 32> &result) {
>      size_t i = 0;
>      for(i = 0; i < 32; i++) {
>          result[i] = data1[i] * data2[i];
>      }
> }
>
> The input arrays are copied (unrealistic code)... Why does it check for overlaps?
> I feel like I’m missing something obvious.
> So how do I inform the compiler that it doesn’t need the memcheck block?
This sounds like either a) a bug, or b) information lost due to lack of 
inlining.  The new dynamic allocations should be marked noalias. As a 
result, we shouldn't need the check.  Have you looked at the O3 IR to 
see if the constructors of array get inlined?

Philip