<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Jul 21, 2016, at 8:28 PM, Qingkun Meng <<a href="mailto:mengqingkun1988@gmail.com" class="">mengqingkun1988@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class=""><div class=""><div class=""><span style="font-size:14px" class="">>if you are interested in what actually gets *executed*, some of these computations will be folded into the addressing mode depending on the architecture</span><span style="font-size:14px" class=""><br class=""></span></div><div class=""><span style="font-size:14px" class=""><br class=""></span></div><div class=""><span style="font-size:14px" class="">If I just want to collect array index manipulations lexically, is there any reliable solution?</span></div></div></div></div></blockquote><div><br class=""></div><div>It depends on what exactly you expect. What would be the ideal output for you on the example you provided before?</div><div>Also, what is the use case (i.e. *why* do you want this information)?</div><br class=""></div><div><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class=""><div class=""><span style="font-size:14px" class=""><br class=""></span></div><div class=""><span style="font-size:14px" class="">By this</span></div><span style="font-size:14px" class="">>Some people do this kind of analysis using debug info to map back to the source code</span><br class=""><div class="gmail_extra">do you mean recovering the source code from LLVM IR? Is there an open source project that does this? 
I would appreciate it if you could point me to one.</div></div></div></div></blockquote><div><br class=""></div><div>I meant debug information, as generated by clang with -g.</div><div>For instance, try a simple example:</div><div><br class=""></div><div>$ cat test.c<br class="">int foo(int a, int b) {<br class=""> return a + b;<br class="">}<br class=""><br class=""></div><div>Then look at how the output differs when compiling with and without -g (i.e. `clang -emit-llvm -S test.c -O3 -o -` and `clang -emit-llvm -S test.c -O3 -o - -g`).</div><div>In the first case you’ll get something like:</div><div><br class=""></div><div>define i32 @foo(i32, i32) #0 {<br class=""> %3 = add nsw i32 %1, %0<br class=""> ret i32 %3<br class="">}</div><div><br class=""></div><div>while in the second case it will look like this (stripped to keep only the relevant information):</div><div><br class=""></div><div>define i32 @foo(i32, i32) #0 !dbg !7 {<br class=""> tail call void @llvm.dbg.value(metadata i32 %0, i64 0, metadata !12, metadata !14), !dbg !15<br class=""> tail call void @llvm.dbg.value(metadata i32 %1, i64 0, metadata !13, metadata !14), !dbg !16<br class=""> %3 = add nsw i32 %1, %0, !dbg !17<br class=""> ret i32 %3, !dbg !18<br class="">}<br class="">[…]</div><div>!1 = !DIFile(filename: "test.c", directory: "…")</div><div>[…]<br class="">!7 = distinct !DISubprogram(name: "foo", scope: !1, file: !1, line: 1, type: !8, isLocal: false, isDefinition: true, scopeLine: 1, flags: DIFlagPrototyped, isOptimized: true, unit: !0, variables: !11)</div><div>[…]<br class="">!12 = !DILocalVariable(name: "a", arg: 1, scope: !7, file: !1, line: 1, type: !10)<br class="">!13 = !DILocalVariable(name: "b", arg: 2, scope: !7, file: !1, line: 1, type: !10)<br class="">!14 = !DIExpression()<br class="">!15 = !DILocation(line: 1, column: 13, scope: !7)<br class="">!16 = !DILocation(line: 1, column: 20, scope: !7)<br class="">!17 = !DILocation(line: 2, column: 12, scope: !7)<br class="">!18 = 
!DILocation(line: 2, column: 3, scope: !7)<br class=""><br class=""></div><div><br class=""></div><div>From there you can analyze the IR and see that there is an addition of two values (%0 and %1), and the calls to llvm.dbg.value point you to information about these variables (name, type, source location).</div><div><br class=""></div><div>-- </div><div>Mehdi</div><div><br class=""></div><br class=""><div><br class=""></div><div><br class=""></div><div><br class=""></div><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class="gmail_extra"><br class=""></div><div class="gmail_extra"><br class=""><div class="gmail_quote">2016-07-22 6:38 GMT+08:00 Mehdi Amini <span dir="ltr" class=""><<a href="mailto:mehdi.amini@apple.com" target="_blank" class="">mehdi.amini@apple.com</a>></span>:<br class=""><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><span class=""><br class="">
> On Jul 21, 2016, at 5:07 AM, Qingkun Meng via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank" class="">llvm-dev@lists.llvm.org</a>> wrote:<br class="">
><br class="">
><br class="">
> Hi there,<br class="">
><br class="">
> I am new to LLVM, and here is my question. Assume there is a function F that contains a loop L and an array b[100]. I want to collect statistics on the operations op(i) applied to the index i inside loop L (for simplicity, only add and mul). Pseudocode is listed below.<br class="">
><br class="">
> void F(arg1, arg2){<br class="">
> int b[100];<br class="">
> for(int i=0; i<n; i++){<br class="">
> op1(i);<br class="">
> op2(i);<br class="">
> ......<br class="">
> b[op1(i)]=n1;<br class="">
> b[op2(i)]=n2; // n1 and n2 are just common constants<br class="">
> }<br class="">
> }<br class="">
><br class="">
> The code fragment is compiled to LLVM IR, and I want to count how many operations (like add and mul) are applied to i. However, the operations are not easy to extract because many temporary variables obscure the trace of the variable. Does anyone have an idea how to solve this, or know of an open source project that does this job?<br class="">
<br class="">
</span>In short: there is no fully reliable way. The optimizer will apply transformations that completely lose any relationship with the source code. Also, if you are interested in what actually gets *executed*, some of these computations will be folded into the addressing mode, depending on the architecture.<br class="">
<br class="">
Some people do this kind of analysis using debug info to map back to the source code. It may be enough if you don’t need precise results, or results that are accurate with respect to the final optimized binary instruction stream.<br class="">
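<br class="">
A minimal sketch of the debug-info approach described above, assuming textual IR produced by clang with -emit-llvm -S -g. The helper name count_ops_by_line and both regular expressions are hypothetical, and they only handle simple one-line instructions and !DILocation nodes. The counts describe the IR that was given, not the original source and not the final machine code.<br class="">
<pre class="">
```python
import re
from collections import Counter

# Sample IR in the shape clang emits with -g (trimmed to two instructions).
SAMPLE_IR = '''
define i32 @foo(i32, i32) #0 !dbg !7 {
  %3 = add nsw i32 %1, %0, !dbg !17
  ret i32 %3, !dbg !18
}
!17 = !DILocation(line: 2, column: 12, scope: !7)
!18 = !DILocation(line: 2, column: 3, scope: !7)
'''

# Metadata node such as: !17 = !DILocation(line: 2, column: 12, scope: !7)
LOC_RE = re.compile(r'^!(\d+) = !DILocation\(line: (\d+), column: (\d+)')
# Value-producing instruction such as: %3 = add nsw i32 %1, %0, !dbg !17
INST_RE = re.compile(r'^\s*%[\w.]+ = (\w+) .*!dbg !(\d+)')


def count_ops_by_line(ir_text, opcodes=('add', 'mul')):
    """Count selected opcodes per source line using their !dbg attachments."""
    # First pass: map each metadata id to its (line, column) pair.
    locations = {}
    for line in ir_text.splitlines():
        m = LOC_RE.match(line)
        if m:
            locations[m.group(1)] = (int(m.group(2)), int(m.group(3)))

    # Second pass: attribute each matching instruction to a source line.
    counts = Counter()
    for line in ir_text.splitlines():
        m = INST_RE.match(line)
        if m and m.group(1) in opcodes and m.group(2) in locations:
            source_line = locations[m.group(2)][0]
            counts[(source_line, m.group(1))] += 1
    return counts


if __name__ == '__main__':
    # The file name test.c is hard-coded here purely for illustration.
    for (line, opcode), n in sorted(count_ops_by_line(SAMPLE_IR).items()):
        print(f'test.c:{line}: {n} x {opcode}')
```
</pre>
On the sample, this attributes one add to source line 2. Instructions without a !dbg attachment, or whose location the optimizer merged or dropped, are simply not counted, which is one reason the results are only approximate.<br class="">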
<br class="">
—<br class="">
<span class=""><font color="#888888" class="">Mehdi<br class="">
<br class="">
</font></span></blockquote></div><br class=""></div></div>
</div></blockquote></div><br class=""></body></html>