<div dir="ltr"><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Jul 22, 2015 at 11:48 PM, Tyler Nowicki <span dir="ltr"><<a href="mailto:tnowicki@apple.com" target="_blank">tnowicki@apple.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Perhaps we should let the user specify vectorize without an option ‘#pragma loop vectorize’ to say ‘warn me if the structure of the loop cannot be vectorized otherwise select the fastest code according to the cost-model’.</blockquote></div><br>This sounds like a great idea.</div><div class="gmail_extra"><br></div><div class="gmail_extra">I would also like to get a full diagnostic about why vectorization failed. That is, I would like -Rpass-missed=loop-vectorize and -Rpass-analysis=loop-vectorize to be enabled by default.</div><div class="gmail_extra"><br><span style="font-size:12.8000001907349px">> Its possible there is a bug in the code for generating the warning or we missed generating it somewhere. Could you provide the loop body?<br></span><br>I provide a minimal example below (in Appendix A). Whether the diagnostic triggers seems to depend on the optimization level selected, which makes sense now that I have looked into it further. 
For example:<br><br>When compiling with -O0, -O2, -O3, and -Ofast I get no diagnostic; in fact, the loop is not vectorized because it is eliminated entirely.<br>When compiling with -O1 I get the diagnostic I wanted by default, without passing any extra flags:<br><br>> file.cpp:13:1: error: loop not vectorized: failed explicitly specified loop<br>>      vectorization [-Werror,-Wpass-failed]<br>>}<br>>^<br>><br>>1 error generated.<br><br>It is also not hard to get "no diagnostic" even for loops that could be vectorized (godbolt+assembly: <a href="http://goo.gl/nfeXOX" target="_blank">http://goo.gl/nfeXOX</a>). In that example, the code is vectorized for a size of 100, but changing the size to 8 removes the vectorization "silently". This is good and makes sense: for a size of 8 it is probably not worth vectorizing the loop, and whatever cost model the vectorizer uses knows this. <br><br><br>Thank you all for your help. 
It is greatly appreciated.<br><br><br>Appendix A: source code<br><br><div class="gmail_extra">#include <iostream></div><div class="gmail_extra"><br></div><div class="gmail_extra">constexpr int indirection(int *__restrict a, int i) { return a[i]; }</div><div class="gmail_extra"><br></div><div class="gmail_extra">constexpr int foo(int start) {</div><div class="gmail_extra">  int count = start;</div><div class="gmail_extra">  int a[8] = {0, 1, 2, 3, 4, 5, 6, 7};</div><div class="gmail_extra">#pragma clang loop vectorize(enable)</div><div class="gmail_extra">  for (int i = 0; i != 8; ++i) {</div><div class="gmail_extra">    count += indirection(a, i);</div><div class="gmail_extra">  }</div><div class="gmail_extra">  return count;</div><div class="gmail_extra">}</div><div class="gmail_extra"><br></div><div class="gmail_extra">int main() {</div><div class="gmail_extra">  int start = 0;</div><div class="gmail_extra">  std::cin >> start;</div><div class="gmail_extra">  std::cout << foo(start);</div><div class="gmail_extra">  return 0;</div><div class="gmail_extra">}<br></div><div class="gmail_extra"><br></div></div><div class="gmail_extra"><br></div></div>