<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Feb 3, 2016, at 12:31 PM, Gleison Souza <<a href="mailto:gleison14051994@gmail.com" class="">gleison14051994@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class=""><span style="font-size:13px" class="">Hi Mats,</span><br style="font-size:13px" class=""><br style="font-size:13px" class=""><span style="font-size:13px" class="">    So, my overall goal is to insert annotations in the original C</span><br style="font-size:13px" class=""><span style="font-size:13px" class="">source. I produce these annotations after analyzing the bytecode that</span><br style="font-size:13px" class=""><span style="font-size:13px" class="">clang gives me for that source. Thus, I need debugging information in</span><br style="font-size:13px" class=""><span style="font-size:13px" class="">the bytecode file. What are these annotations? They are comments</span><br style="font-size:13px" class=""><span style="font-size:13px" class="">relating pointer variables to their sizes, expressed as program</span><br style="font-size:13px" class=""><span style="font-size:13px" class="">symbols. 
</span></div></div></blockquote><div><br class=""></div><div>Relying on debug information for that doesn’t sound like a good solution, as you may have figured out.</div><div>Have you looked at what Polly is doing to infer such bounds?</div><div><br class=""></div><div>A more robust alternative would be to use pragmas instead of comments, have the front end parse these pragmas, and lower the information you want into something like llvm.assume.</div><div><br class=""></div><div>-- </div><div>Mehdi</div><div><br class=""></div><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><span style="font-size:13px" class="">For instance, if I have a program like the one below:</span><br style="font-size:13px" class=""><br style="font-size:13px" class=""><span style="font-size:13px" class="">void foo(int *v, int N) {</span><br style="font-size:13px" class=""><span style="font-size:13px" class="">  int i;</span><br style="font-size:13px" class=""><span style="font-size:13px" class="">  for (i = 0; i < N; i++) {</span><br style="font-size:13px" class=""><span style="font-size:13px" class="">    v[i] = 0;</span><br style="font-size:13px" class=""><span style="font-size:13px" class="">  }</span><br style="font-size:13px" class=""><span style="font-size:13px" class="">}</span><br style="font-size:13px" class=""><br style="font-size:13px" class=""><span style="font-size:13px" class="">I insert the following comment into this code:</span><br style="font-size:13px" class=""><br style="font-size:13px" class=""><span style="font-size:13px" class="">void foo(int *v, int N) {</span><br style="font-size:13px" class=""><span style="font-size:13px" class="">  int i;</span><br style="font-size:13px" class=""><span style="font-size:13px" class="">  // v is accessed within the loop, and its size is [0..N-1]</span><br style="font-size:13px" class=""><span style="font-size:13px" class="">  for (i = 0; i < N; i++) {</span><br style="font-size:13px" 
class=""><span style="font-size:13px" class="">    v[i] = 0;</span><br style="font-size:13px" class=""><span style="font-size:13px" class="">  }</span><br style="font-size:13px" class=""><span style="font-size:13px" class="">}</span><br style="font-size:13px" class=""><br style="font-size:13px" class=""><span style="font-size:13px" class="">    My prototype infers that 'v' is a pointer, and that it ranges over</span><br style="font-size:13px" class=""><span style="font-size:13px" class="">the region &v + 0 to &v + N - 1, where 'N' is a variable live at</span><br style="font-size:13px" class=""><span style="font-size:13px" class="">the beginning of the loop. In the end, my analysis works for many</span><br style="font-size:13px" class=""><span style="font-size:13px" class="">kinds of programs, but I am having trouble inferring that 'v' is an</span><br style="font-size:13px" class=""><span style="font-size:13px" class="">array with stable bounds whenever its base pointer (GetElementPtr)</span><br style="font-size:13px" class=""><span style="font-size:13px" class="">comes out of a load instruction which is inside the loop.</span></div></div></blockquote><div><br class=""></div><div>This example, like the one you sent just before (without a global variable), does not have a load for the base address in the optimized IR.</div><div><br class=""></div><div>-- </div><div>Mehdi</div><div><br class=""></div><div><br class=""></div><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><span style="font-size:13px" class=""> That's why I</span><br style="font-size:13px" class=""><span style="font-size:13px" class="">would like to be able to hoist out as many loads as I can. 
The</span><br style="font-size:13px" class=""><span style="font-size:13px" class="">question then is: what is the command line that I should pass to opt</span><br style="font-size:13px" class=""><span style="font-size:13px" class="">that is likely to hoist loads outside loops while preserving debugging</span><br style="font-size:13px" class=""><span style="font-size:13px" class="">info? Mehdi's suggestion got me really close (clang -O1 -S -o - -mllvm</span><br style="font-size:13px" class=""><span style="font-size:13px" class="">-print-after-all), but it strips the debugging information from</span><br style="font-size:13px" class=""><span style="font-size:13px" class="">the code. Without debugging information, I can't associate names at</span><br style="font-size:13px" class=""><span style="font-size:13px" class="">the bytecode level with names at the C-source level.</span><br style="font-size:13px" class=""><br style="font-size:13px" class=""><span style="font-size:13px" class="">Regards,</span><br class=""><div class=""><span style="font-size:13px" class=""><br class=""></span></div><div class=""><span style="font-size:13px" class="">    Gleison</span></div></div><div class="gmail_extra"><br class=""><div class="gmail_quote">2016-02-03 11:06 GMT-02:00 mats petersson <span dir="ltr" class=""><<a href="mailto:mats@planetcatfish.com" target="_blank" class="">mats@planetcatfish.com</a>></span>:<br class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr" class=""><div class="">Just to be clear, "What are you actually trying to achieve?" 
was meant to ask what your overall goal is: not the transformation of the IR itself, but why you need to do it. Is it a simple performance enhancement, or is there some other reason?<br class=""><br class=""></div><div class="">As has been stated, alias analysis will help here, but in some cases the compiler cannot say for certain that no access through the pointer will affect something else, so it NEEDS to take the conservative route and reload the pointer. Note that clang accepts gcc's -fstrict-aliasing flag, which enables type-based alias analysis.<br class=""></div><div class=""><br class="">--<br class=""></div>Mats<br class=""></div><div class="gmail_extra"><br class=""><div class="gmail_quote"><div class=""><div class="h5">On 3 February 2016 at 12:05, Gleison Souza via llvm-dev <span dir="ltr" class=""><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank" class="">llvm-dev@lists.llvm.org</a>></span> wrote:<br class=""></div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class=""><div class="h5"><div dir="ltr" class=""><div class="">Thanks Mehdi,</div><div class=""><br class=""></div><div class="">I tried to use this, but some debug information can be lost in these optimizations.</div><div class="">I need to write to the source file to insert information before the loops, and in</div><div class="">some cases I end up writing after the loop header.</div><div class=""><br class=""></div><div class="">Please take a look:</div><div class=""><br class=""></div><div class="">int foo1 (int *a, int *b, int n) {</div><div class="">  int i, s = 0;</div><div class="">  for (i = 0; i < n; i++) {</div><div class="">    s = s * a[i];</div><div 
class="">  }</div><div class=""><br class=""></div><div class="">  for (i = 0; i < n; i++) {</div><div class="">    b[i] = a[i] + 3;</div><div class="">    s += a[i];</div><div class="">  }</div><div class="">  return s;</div><div class="">}</div><div class=""><br class=""></div><div class="">In this case, I use the line number obtained from the LLVM IR through this call:</div><div class=""><br class=""></div><div class="">Line = l->getStartLoc().getLine();</div><div class=""><br class=""></div><div class="">The source file is cloned, and my annotations now end up inside the loop.</div><div class="">I can't make many modifications to the structure of the code, only the necessary ones; that's the problem.</div><div class=""><br class=""></div><div class="">Regards,</div><div class=""><br class=""></div><div class="">    Gleison</div></div><div class=""><div class=""><div class="gmail_extra"><br class=""><div class="gmail_quote">2016-02-03 1:07 GMT-02:00 Mehdi Amini <span dir="ltr" class=""><<a href="mailto:mehdi.amini@apple.com" target="_blank" class="">mehdi.amini@apple.com</a>></span>:<br class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word" class=""><br class=""><div class=""><span class=""><blockquote type="cite" class=""><div class="">On Feb 2, 2016, at 8:35 AM, Gleison Souza via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank" class="">llvm-dev@lists.llvm.org</a>> wrote:</div><br class=""><div 
class=""><div dir="ltr" class=""><div style="font-size:13px" class="">Dear LLVMers,</div><div style="font-size:13px" class=""><br class=""></div><div style="font-size:13px" class="">    I am trying to implement a particular type of loop optimization, but I am having problems with global variables. To solve this problem, I would like to know if LLVM has some pass that moves loads outside loops. I will illustrate with an example. I want to transform this code below. I am writing in C for readability, but I am analysing LLVM IR:</div><div style="font-size:13px" class=""><br class=""></div><div style="font-size:13px" class="">int *vectorE;</div><div style="font-size:13px" class=""><br class=""></div><div style="font-size:13px" class="">void foo (int n) {       </div><div style="font-size:13px" class="">  int i;</div><div style="font-size:13px" class="">  for (i = 0; i < n; i++)</div><div style="font-size:13px" class="">    vectorE[i] = i;</div><div style="font-size:13px" class="">}</div><div style="font-size:13px" class=""><br class=""></div><div style="font-size:13px" class="">into this one:</div><div style="font-size:13px" class=""><br class=""></div><div style="font-size:13px" class="">int *vectorE;</div><div style="font-size:13px" class=""><br class=""></div><div style="font-size:13px" class="">void foo (int n) {       </div><div style="font-size:13px" class="">  int i;</div><div style="font-size:13px" class="">  int* aux = vectorE;</div><div style="font-size:13px" class="">  for (i = 0; i < n; i++)</div><div style="font-size:13px" class="">    aux[i] = i;</div><div style="font-size:13px" class="">}</div></div></div></blockquote><div class=""><br class=""></div><div class=""><br class=""></div></span><div class="">Have you looked at the output of clang with optimization enabled (even O1)? 
For this C code, the optimizer moves the load of the global into the loop preheader, so the loop itself does not access the global at all, which seems to be what you’re looking for.</div><div class=""><br class=""></div><div class="">Try: clang -O1 -S -o - -mllvm -print-after-all</div><div class=""><br class=""></div><div class="">-- </div><span class=""><font color="#888888" class=""><div class="">Mehdi</div><div class=""><br class=""></div><div class=""><br class=""></div></font></span></div></div></blockquote></div><br class=""></div>

</div></div><br class=""></div></div><span class="">_______________________________________________<br class="">

LLVM Developers mailing list<br class="">

<a href="mailto:llvm-dev@lists.llvm.org" target="_blank" class="">llvm-dev@lists.llvm.org</a><br class="">

<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank" class="">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br class="">

<br class=""></span></blockquote></div><br class=""></div>

</blockquote></div><br class=""></div>

</div></blockquote></div><br class=""></body></html>