[LLVMdev] failed folding with constant array with opt -O3

Wed Sep 10 12:58:59 PDT 2014

Adding some form of alias analysis, like -basicaa, to the pass list 
(before -gvn) should solve the clobbering issue.

You're probably also interested in some of the following to get your code 
completely optimized again: -constprop -simplifycfg -dse
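
For example, a rough sketch of such an invocation, assuming the IR is 
saved as t1.txt (the name in your ModuleID line) and that the passes run 
in the order given. I have not verified the output on your example:

---
opt -basicaa -gvn -constprop -simplifycfg -dse t1.txt -S
---

-basicaa goes before -gvn so that the memory dependence queries made by 
gvn can use it. You may have to reorder or repeat the cleanup passes.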

Cheers,
  Roel

On 10/09/14 21:00, Peng Cheng wrote:
> Thanks for your help!  Providing data layout works for the example I have.
>
> While investigating, I also got the feeling that something else is
> still going on.
>
> I have another simple IR, which basically does the same thing as the
> original example, except that it initializes the array element by
> element instead of from a constant array.
>
> ---
> define void @f(i32* nocapture %l0) {
> entry:
>    %fc_ = alloca [1 x i32]
>    %0 = getelementptr inbounds [1 x i32]* %fc_, i32 0, i32 0
>    store i32 1, i32* %0, align 4
>    %1 = getelementptr [1 x i32]* %fc_, i64 0, i64 0
>    %2 = load i32* %1
>    %tobool = icmp eq i32 %2, 0
>    br i1 %tobool, label %4, label %3
>
> ; <label>:3                                       ; preds = %entry
>    store i32 1, i32* %l0
>    br label %4
>
> ; <label>:4                                       ; preds = %entry, %3
>    %storemerge = phi i32 [ 1, %3 ], [ 0, %entry ]
>    store i32 %storemerge, i32* %l0
>    ret void
> }
> ---
>
> For this IR, without a target data layout,
> opt -O2 gives the expected optimization:
>
> ---
> ; ModuleID = 't1.txt'
>
> ; Function Attrs: nounwind
> define void @f(i32* nocapture %l0) #0 {
>    store i32 1, i32* %l0
>    ret void
> }
>
> attributes #0 = { nounwind }
> ---
>
> By checking the IR after each transformation, I see that gvn removes the
> load, and the transformations before it do nothing.
>
> So I ran opt -gvn alone on the IR, but the load was not eliminated.
>
> I looked into the gvn code.  It looks like the memory dependency
> computation gives different results for the load with -O2 and with
> -gvn alone.  With -O2, the load is not clobbered, but with gvn alone it
> is reported as clobbered.
>
> Does that sound expected?
>
>
>
>
>
>
> On Wed, Sep 10, 2014 at 12:50 PM, Philip Reames
> <listmail at philipreames.com> wrote:
>
>     I came in to an email this morning that said basically the same
>     thing for the reduced example we were looking at.  However, the
>     original IR it came from (before hand reduction) had the data layout
>     set correctly, so there's probably still *something* going on.  It's
>     just not what I thought at first.  :)
>
>     Philip
>
>
>
>     On 09/10/2014 02:26 AM, Roel Jordans wrote:
>
>         Looking at the -debug output of opt shows that SROA was skipped
>         due to missing target data.
>
>         Adding something like:
>
>         target datalayout = "e-p:32:32:32-i32:32:32"
>
>         to the top seems sufficient to fix the issue at -O3.
>
>         With the size and alignment requirements for i32 defined, SROA
>         is capable of rewriting the array load into a constant scalar
>         load, which can then be further optimized.
>
>         Cheers,
>           Roel
>
>         On 09/09/14 18:30, Peng Cheng wrote:
>
>             I have the following simplified LLVM IR, which basically
>             returns a value based on the first element of a constant
>             array.
>
>             ----
>             ; ModuleID = 'simple_ir3.txt'
>
>             ; constant array with value 1 at the first element
>             @f.b = constant [1 x i32] [i32 1], align 4
>
>             define void @f(i32* nocapture %l0) {
>             entry:
>                 %fc_ = alloca [1 x i32]
>                 %f.b.v = load [1 x i32]* @f.b
>                 store [1 x i32] %f.b.v, [1 x i32]* %fc_
>                 ; load the first element of the constant array, which
>                 ; is actually 1
>                 %0 = getelementptr [1 x i32]* %fc_, i64 0, i64 0
>                 %1 = load i32* %0
>                 ; check whether the first element is nonzero, which is
>                 ; always true since the first element of the constant
>                 ; array is 1
>                 %tobool = icmp ne i32 %1, 0
>                 br i1 %tobool, label %2, label %4
>
>             ; <label>:2               ; true branch
>                 store i32 1, i32* %l0;
>                 %3 = load i32* %l0;
>                 br label %4
>
>             ; <label>:4
>                 %storemerge = phi i32 [ %3, %2 ], [ 0, %entry ]
>                 store i32 %storemerge, i32* %l0
>                 ret void
>             }
>             ---
>
>             I ran opt -O3 simple_ir.txt -S, and got:
>
>             ---
>             ; ModuleID = 'simple_ir3.txt'
>
>             @f.b = constant [1 x i32] [i32 1], align 4
>
>             ; Function Attrs: nounwind
>             define void @f(i32* nocapture %l0) #0 {
>             entry:
>                 %fc_ = alloca [1 x i32]
>                 store [1 x i32] [i32 1], [1 x i32]* %fc_
>                 %0 = getelementptr [1 x i32]* %fc_, i64 0, i64 0
>                 %1 = load i32* %0
>                 %tobool = icmp eq i32 %1, 0
>                 br i1 %tobool, label %3, label %2
>
>             ; <label>:2                                       ; preds = %entry
>                 store i32 1, i32* %l0
>                 br label %3
>
>             ; <label>:3                                       ; preds = %entry, %2
>                 %storemerge = phi i32 [ 1, %2 ], [ 0, %entry ]
>                 store i32 %storemerge, i32* %l0
>                 ret void
>             }
>
>             attributes #0 = { nounwind }
>             ---
>
>             I would expect that constant folding, or some other
>             transformation, would be able to fold the constant to get
>             the following IR:
>
>             ---
>             define void @f(i32* nocapture %l0) #0 {
>                 store i32 1, i32* %l0
>                 ret void
>             }
>             ---
>
>             How could I get the expected optimized IR?  Should I update
>             the original IR, or use a different set of transformations?
>
>             Any suggestions or comments?
>
>
>             Thanks,
>             -Peng
>
>
>             _________________________________________________
>             LLVM Developers mailing list
>             LLVMdev at cs.uiuc.edu
>             http://llvm.cs.uiuc.edu
>             http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev