[llvm-dev] Loop vectorization and unsafe floating point math

Björn Pettersson A via llvm-dev llvm-dev at lists.llvm.org
Wed Jun 24 08:21:32 PDT 2020


Hi llvm-dev!

We are doing some fuzz testing using C program generators,
and one question that came up when generating a program with
both floating point arithmetic and loop pragmas was:
Is the loop vectorizer really allowed to vectorize a loop when
it can't prove that it is safe to reorder fp math, even if
there is a loop pragma that hints about a preferred width?


When reading here

  http://clang.llvm.org/docs/LanguageExtensions.html#extensions-for-loop-hint-optimizations

it says "Loop hints can be specified before any loop and
will be ignored if the optimization is not safe to apply.".


But given this example (see also https://godbolt.org/z/fzRHsp )

//------------------------------------------------------------------
//
//  clang -O3 -Rpass=loop-vectorize -Rpass-analysis=loop-vectorize

#include <stdio.h>
#include <stdint.h>

double v_1 = -902.30847021;
double v_2 = -902.30847021;

int main()
{

  #pragma clang loop vectorize_width(2) unroll(disable)
  for (int i = 0; i < 16; ++i) {
    v_1 = v_1 * 430.33975544;
  }

  #pragma clang loop unroll(disable)
  for (int i = 0; i < 16; ++i) {
    v_2 = v_2 * 430.33975544;
  }

  printf("v_1: %f\n", v_1);
  printf("v_2: %f\n", v_2);
}

//
//------------------------------------------------------------------


we get these remarks:

  <source>:11:3: remark: the cost-model indicates that interleaving is not beneficial [-Rpass-analysis=loop-vectorize]
  <source>:11:3: remark: vectorized loop (vectorization width: 2, interleaved count: 1) [-Rpass=loop-vectorize]
  <source>:17:15: remark: loop not vectorized: cannot prove it is safe to reorder floating-point operations; allow reordering by specifying '#pragma clang loop vectorize(enable)' 

and the result:

  v_1: -1248356232174473978185211891975727638059679744.000000
  v_2: -1248356232174473819728886863447052450971779072.000000


So the second loop isn't vectorized, because the fp math can't be proven safe to reorder.
But the first loop is vectorized, even though the optimization isn't safe to apply.
This is also reflected in the different results we get for v_1 and v_2.
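
To illustrate why the reordering matters, here is a minimal sketch that emulates
a width-2 product reduction in plain C. It rests on an assumption that is not
confirmed by the remarks above: that the vectorized loop seeds lane 0 with the
start value and lane 1 with 1.0, multiplies both lanes by the constant in each
vector iteration, and multiplies the lanes together after the loop. The function
names are hypothetical and only serve the illustration.

```c
#include <math.h>

/* Strict source order: ((v * c) * c) * ... with one rounding per step. */
double scalar_product(double v, double c, int n)
{
    for (int i = 0; i < n; ++i)
        v = v * c;
    return v;
}

/* Hypothetical emulation of a width-2 vectorized product reduction.
 * Assumption: lane 0 starts at v, lane 1 starts at the multiplicative
 * identity, and the two lanes are combined once after the loop.
 * n is assumed to be even. */
double width2_product(double v, double c, int n)
{
    double lane[2] = { v, 1.0 };
    for (int i = 0; i < n; i += 2) {
        lane[0] = lane[0] * c;   /* rounds after each step */
        lane[1] = lane[1] * c;   /* independent chain of roundings */
    }
    return lane[0] * lane[1];    /* final horizontal reduction */
}

/* Relative difference between two results, to compare the two orders. */
double relative_difference(double a, double b)
{
    return fabs(a - b) / fabs(b);
}
```

Calling both functions with v = -902.30847021, c = 430.33975544 and n = 16 and
printing the results with %f shows whether the two evaluation orders round
differently on a given platform. The products are mathematically equal, so any
difference comes only from where the intermediate roundings happen. The exact
digits need not match the output above if the real vectorizer uses a different
scheme.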


Is this correct behavior? Should the pragma result in vectorization here?

Note that we get vectorization even with "vectorize_width(3)". So despite
the fact that LV ignores the bad vectorization factor, it considers vectorization
to be "forced".

(I also wonder if "forced" is bad terminology here, if the pragma is supposed to be considered a hint.)

Regards,
Björn Pettersson

