<div dir="ltr">Thanks for the input! Adding basicaa works, and the extra transforms produce the ideal optimized code.<div><br></div><div>I originally thought GVN had an AG dependency on alias analysis, so it should have performed the expected elimination. While digging into the code, I found that NoAA is what GVN gets by default. Therefore, to get meaningful results from GVN, basicaa is needed.</div><div><br></div><div>In summary, there are two sets of transformations that take the original IR to the optimized code. One goes through sroa, which needs the target data layout; the other uses gvn, which needs a nontrivial AA.</div><div><br></div><div>Any thoughts on which one is computationally cheaper for eliminating the load? sroa seems to be the one to me.</div><div><br></div><div>Best,</div><div>-Peng</div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Sep 10, 2014 at 3:58 PM, Roel Jordans <span dir="ltr"><<a href="mailto:r.jordans@tue.nl" target="_blank">r.jordans@tue.nl</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Adding some form of alias analysis like -basicaa to the list should solve the clobbering issue.<br>
<br>
You're probably also interested in some of the following to get your code completely optimized again: -constprop -simplifycfg -dse<br>
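<br>
To illustrate the clobbering issue, here is the element-by-element example again with comments added. The comments are an interpretation, not something confirmed in this thread: the assumption is that the store and the load reach the same address through two distinct GEP values, which a pass without real alias information cannot prove identical.<br>
<br>
```llvm
define void @f(i32* nocapture %l0) {
entry:
  %fc_ = alloca [1 x i32]
  ; The store goes through %0 ...
  %0 = getelementptr inbounds [1 x i32]* %fc_, i32 0, i32 0
  store i32 1, i32* %0, align 4
  ; ... while the load goes through %1, a distinct SSA value for the same
  ; address (i64 indices instead of i32, and no inbounds).
  %1 = getelementptr [1 x i32]* %fc_, i64 0, i64 0
  ; With NoAA the store above can only be treated as a possible clobber of
  ; this load. Assumption: an analysis such as -basicaa can show that %0 and
  ; %1 address the same element, so the stored value 1 can be forwarded.
  %2 = load i32* %1
  %tobool = icmp eq i32 %2, 0
  br i1 %tobool, label %4, label %3

; <label>:3
  store i32 1, i32* %l0
  br label %4

; <label>:4
  %storemerge = phi i32 [ 1, %3 ], [ 0, %entry ]
  store i32 %storemerge, i32* %l0
  ret void
}
```

Once the load is replaced by the constant 1, the branch condition is constant, and the cleanup passes listed above should be able to reduce the function to a single store.<br>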
<br>
Cheers,<br>
 Roel<div><div class="h5"><br>
<br>
On 10/09/14 21:00, Peng Cheng wrote:<br>
</div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="h5">
Thanks for your help! Providing data layout works for the example I have.<br>
<br>
Based on my investigation, I also feel there is still something else<br>
going on.<br>
<br>
I have another simple IR, which basically does the same thing as the<br>
original example except that it initializes the array element by element<br>
instead of from a constant array.<br>
<br>
---<br>
define void @f(i32* nocapture %l0) {<br>
entry:<br>
  %fc_ = alloca [1 x i32]<br>
  %0 = getelementptr inbounds [1 x i32]* %fc_, i32 0, i32 0<br>
  store i32 1, i32* %0, align 4<br>
  %1 = getelementptr [1 x i32]* %fc_, i64 0, i64 0<br>
  %2 = load i32* %1<br>
  %tobool = icmp eq i32 %2, 0<br>
  br i1 %tobool, label %4, label %3<br>
<br>
; <label>:3                    ; preds = %entry<br>
  store i32 1, i32* %l0<br>
  br label %4<br>
<br>
; <label>:4                    ; preds = %entry, %3<br>
  %storemerge = phi i32 [ 1, %3 ], [ 0, %entry ]<br>
  store i32 %storemerge, i32* %l0<br>
  ret void<br>
}<br>
---<br>
<br>
For this IR, without a target data layout,<br>
opt -O2 got the expected optimization:<br>
<br>
---<br>
; ModuleID = 't1.txt'<br>
<br>
; Function Attrs: nounwind<br>
define void @f(i32* nocapture %l0) #0 {<br>
  store i32 1, i32* %l0<br>
  ret void<br>
}<br>
<br>
attributes #0 = { nounwind }<br>
---<br>
<br>
By checking the IR after each transformation, I see that gvn removes the<br>
load, while the transformations before it do nothing.<br>
<br>
So I ran opt -gvn on the IR, but the load was not eliminated.<br>
<br>
I looked into the gvn code. It appears that the memory dependency<br>
computation gives different results for the load under -O2 and under<br>
-gvn alone. With -O2, the load is not clobbered, but with gvn alone, it is<br>
reported as clobbered.<br>
<br>
Is that expected?<br>
<br>
<br>
<br>
<br>
<br>
<br>
On Wed, Sep 10, 2014 at 12:50 PM, Philip Reames<br></div></div><div><div class="h5">
<<a href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>> wrote:<br>
<br>
  I came in to an email this morning that said basically the same<br>
  thing for the reduced example we were looking at. However, the<br>
  original IR it came from (before hand reduction) had the data layout<br>
  set correctly, so there's probably still *something* going on. It's<br>
  just not what I thought at first. :)<br>
<br>
  Philip<br>
<br>
<br>
<br>
  On 09/10/2014 02:26 AM, Roel Jordans wrote:<br>
<br>
    Looking at the -debug output of opt shows that SROA was skipped<br>
    due to missing target data.<br>
<br>
    Adding something like:<br>
<br>
    target datalayout = "e-p:32:32:32-i32:32:32"<br>
<br>
    to the top seems sufficient to fix the issue at -O3.<br>
<br>
    By defining the size and storage requirements for i32, SROA is<br>
    capable of rewriting the array load into a constant scalar load,<br>
    which can then be further optimized.<br>
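<br>
As a sketch of what that looks like in practice, this is the reduced module from the original message with the suggested line placed at the top. The layout string is copied from the suggestion above and is only an assumed 32-bit layout; a real module should carry the layout of its actual target.<br>
<br>
```llvm
; Assumed layout string from the suggestion above: little-endian,
; 32-bit pointers, and i32 with 32-bit ABI and preferred alignment.
target datalayout = "e-p:32:32:32-i32:32:32"

@f.b = constant [1 x i32] [i32 1], align 4

define void @f(i32* nocapture %l0) {
entry:
  %fc_ = alloca [1 x i32]
  %f.b.v = load [1 x i32]* @f.b
  store [1 x i32] %f.b.v, [1 x i32]* %fc_
  %0 = getelementptr [1 x i32]* %fc_, i64 0, i64 0
  %1 = load i32* %0
  %tobool = icmp ne i32 %1, 0
  br i1 %tobool, label %2, label %4

; <label>:2
  store i32 1, i32* %l0
  %3 = load i32* %l0
  br label %4

; <label>:4
  %storemerge = phi i32 [ %3, %2 ], [ 0, %entry ]
  store i32 %storemerge, i32* %l0
  ret void
}
```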
<br>
    Cheers,<br>
     Roel<br>
<br>
    On 09/09/14 18:30, Peng Cheng wrote:<br>
<br>
      I have the following simplified LLVM IR, which basically returns a<br>
      value based on the first element of a constant array.<br>
<br>
      ----<br>
      ; ModuleID = 'simple_ir3.txt'<br>
<br>
      ; constant array with value 1 at the first element<br>
      @f.b = constant [1 x i32] [i32 1], align 4<br>
<br>
      define void @f(i32* nocapture %l0) {<br>
      entry:<br>
        %fc_ = alloca [1 x i32]<br>
        %f.b.v = load [1 x i32]* @f.b<br>
        store [1 x i32] %f.b.v, [1 x i32]* %fc_<br>
        ; load the first element of the constant array, which is actually 1<br>
        %0 = getelementptr [1 x i32]* %fc_, i64 0, i64 0<br>
        %1 = load i32* %0<br>
        ; check whether the first element is nonzero, which is always true<br>
        ; since the first element of the constant array is 1<br>
        %tobool = icmp ne i32 %1, 0<br>
        br i1 %tobool, label %2, label %4<br>
<br>
      ; <label>:2        ; true branch<br>
        store i32 1, i32* %l0;<br>
        %3 = load i32* %l0;<br>
        br label %4<br>
<br>
      ; <label>:4<br>
        %storemerge = phi i32 [ %3, %2 ], [ 0, %entry ]<br>
        store i32 %storemerge, i32* %l0<br>
        ret void<br>
      }<br>
      ---<br>
<br>
      I ran opt -O3 simple_ir.txt -S, and got:<br>
<br>
      ---<br>
      ; ModuleID = 'simple_ir3.txt'<br>
<br>
      @f.b = constant [1 x i32] [i32 1], align 4<br>
<br>
      ; Function Attrs: nounwind<br>
      define void @f(i32* nocapture %l0) #0 {<br>
      entry:<br>
        %fc_ = alloca [1 x i32]<br>
        store [1 x i32] [i32 1], [1 x i32]* %fc_<br>
        %0 = getelementptr [1 x i32]* %fc_, i64 0, i64 0<br>
        %1 = load i32* %0<br>
        %tobool = icmp eq i32 %1, 0<br>
        br i1 %tobool, label %3, label %2<br>
<br>
      ; <label>:2                    ; preds = %entry<br>
        store i32 1, i32* %l0<br>
        br label %3<br>
<br>
      ; <label>:3                    ; preds = %entry, %2<br>
        %storemerge = phi i32 [ 1, %2 ], [ 0, %entry ]<br>
        store i32 %storemerge, i32* %l0<br>
        ret void<br>
      }<br>
<br>
      attributes #0 = { nounwind }<br>
      ---<br>
<br>
      I would expect constant folding, or some other transformation,<br>
      to be able to fold the constant and produce the following IR:<br>
<br>
      ---<br>
      define void @f(i32* nocapture %l0) #0 {<br>
        store i32 1, i32* %l0<br>
        ret void<br>
      }<br>
      ---<br>
<br>
      How can I get the expected optimized IR? Should I update the<br>
      original IR, or use a different set of transformations?<br>
<br>
      Any suggestions or comments?<br>
<br>
<br>
      Thanks,<br>
      -Peng<br>
<br>
<br></div></div>
      ______________________________<u></u>___________________<br>
      LLVM Developers mailing list<br>
      <a href="mailto:LLVMdev@cs.uiuc.edu" target="_blank">LLVMdev@cs.uiuc.edu</a> <mailto:<a href="mailto:LLVMdev@cs.uiuc.edu" target="_blank">LLVMdev@cs.uiuc.edu</a>><br>
      <a href="http://llvm.cs.uiuc.edu" target="_blank">http://llvm.cs.uiuc.edu</a><br>
      <a href="http://lists.cs.uiuc.edu/__mailman/listinfo/llvmdev" target="_blank">http://lists.cs.uiuc.edu/__<u></u>mailman/listinfo/llvmdev</a><br>
      <<a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev" target="_blank">http://lists.cs.uiuc.edu/<u></u>mailman/listinfo/llvmdev</a>><br>
<br>
<br>
<br>
<br>
<br>
</blockquote>
</blockquote></div><br></div>