[cfe-dev] How to configure clang, to get const functions out of the loop (like on FreeBSD) ?

Nat! via cfe-dev cfe-dev at lists.llvm.org
Tue Aug 15 15:32:28 PDT 2017


I have a pretty severe performance problem.

I would have thought that clang would hoist my `__attribute__((const))` 
functions out of a loop when the input to the function is constant too. 
But out of the box it rarely seems to do so.

Here is a simple example:

```
extern __attribute__(( const))  int  foo( int i);

extern int  bar( int i);


int  foobar( void)
{
     int   i;
     int   x;

     x = 0;
     for( i = 0; i < 100; i++)
     {
        x += foo( 0x2373);
        x  = bar( x);
     }
     return( x == 1848);
}
```

When I put this into https://godbolt.org/ for x86-64 trunk with options 
`-O3 -S -emit-llvm` I get:

```
; ...
; <label>:1: ; preds = %1, %0
   %2 = phi i32 [ 0, %0 ], [ %6, %1 ]
   %3 = phi i32 [ 0, %0 ], [ %7, %1 ]
   tail call void @llvm.dbg.value(metadata i32 %3, metadata !12, metadata !14), !dbg !16
   tail call void @llvm.dbg.value(metadata i32 %2, metadata !13, metadata !14), !dbg !15

   %4 = tail call i32   @foo(int)(i32 9075)   #4, !dbg !19

   %5 = add nsw i32 %4, %2, !dbg !22
   tail call void @llvm.dbg.value(metadata i32 %5, metadata !13, metadata !14), !dbg !15
   %6 = tail call i32 @bar(int)(i32 %5), !dbg !23
   tail call void @llvm.dbg.value(metadata i32 %6, metadata !13, metadata !14), !dbg !15
   %7 = add nuw nsw i32 %3, 1, !dbg !24
   tail call void @llvm.dbg.value(metadata i32 %7, metadata !12, metadata !14), !dbg !16
   tail call void @llvm.dbg.value(metadata i32 %6, metadata !13, metadata !14), !dbg !15
   tail call void @llvm.dbg.value(metadata i32 %7, metadata !12, metadata !14), !dbg !16
   %8 = icmp eq i32 %7, 100, !dbg !25
   br i1 %8, label %9, label %1, !dbg !17, !llvm.loop !26
; ...
!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, file: !1, producer: "clang version 6.0.0 (trunk 310909)", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2)
```

My const function is clearly still in the loop. Somehow the presence of 
`bar` seems to be the problem: if I remove the `bar` call, the optimizer 
can even collapse the loop. (I observe the same not just with godbolt, but 
also with my own clang 4.0.0 derivative and Apple's Xcode 8.3.3 clang.)
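
To make clear what I mean by "collapse the loop", here is a rough sketch in C. The body of `foo` is a placeholder I made up so the snippet compiles on its own, and `foobar_collapsed` is only my reading of what the optimizer ends up with, not its literal output:

```c
#include <assert.h>

/* Placeholder body (my assumption); the real foo is external. */
__attribute__(( const))  int  foo( int i)
{
     return( (i & 0xff) + 1);   /* depends only on its argument */
}

/* The loop from my example with the bar call removed. */
int  foobar_without_bar( void)
{
     int   i;
     int   x;

     x = 0;
     for( i = 0; i < 100; i++)
        x += foo( 0x2373);
     return( x == 1848);
}

/* Roughly what I'd expect it to collapse to: one call, one multiply. */
int  foobar_collapsed( void)
{
     return( 100 * foo( 0x2373) == 1848);
}

/* Both variants must agree, since foo has no side effects. */
void  check_collapse_is_equivalent( void)
{
     assert( foobar_without_bar() == foobar_collapsed());
}
```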

So I tried a few other versions of clang (like 3.4.5 for instance...) in 
the godbolt explorer, but all exhibited the same behaviour.

But now comes the crazy part: when I do it on FreeBSD with clang-3.4.5, 
it works and produces:

```
   %1 = tail call i32 @foo(i32 9075) #3
   br label %2

; <label>:2                                       ; preds = %2, %0
   %x.02 = phi i32 [ 0, %0 ], [ %4, %2 ]
   %i.01 = phi i32 [ 0, %0 ], [ %5, %2 ]
   %3 = add nsw i32 %1, %x.02
   %4 = tail call i32 @bar(i32 %3) #4
   %5 = add nsw i32 %i.01, 1
   %exitcond = icmp eq i32 %5, 100
   br i1 %exitcond, label %6, label %2
...
!0 = metadata !{metadata !"FreeBSD clang version 3.4.1 
(tags/RELEASE_34/dot1-final 208032) 20140512"}
```
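
In C terms, the FreeBSD output above corresponds to hoisting the `foo` call by hand, as in the sketch below. The bodies of `foo` and `bar` are placeholders I invented so the snippet is self-contained (the real ones are external), and the names `foobar_hoisted` and `check_hoisting_is_equivalent` are hypothetical:

```c
#include <assert.h>

/* Placeholder bodies (my assumptions, only so this compiles standalone). */
__attribute__(( const))  int  foo( int i)
{
     return( (i & 0xff) + 1);        /* depends only on its argument */
}

int  bar( int i)
{
     return( (i * 31 + 7) % 1000);   /* arbitrary, keeps values small */
}

/* Original loop: foo is called on every iteration. */
int  foobar( void)
{
     int   i;
     int   x;

     x = 0;
     for( i = 0; i < 100; i++)
     {
        x += foo( 0x2373);
        x  = bar( x);
     }
     return( x == 1848);
}

/* Hand-hoisted variant: foo is called once, before the loop. */
int  foobar_hoisted( void)
{
     int   i;
     int   x;
     int   f;

     f = foo( 0x2373);
     x = 0;
     for( i = 0; i < 100; i++)
     {
        x += f;
        x  = bar( x);
     }
     return( x == 1848);
}

/* Since foo is const, both variants must return the same result. */
void  check_hoisting_is_equivalent( void)
{
     assert( foobar() == foobar_hoisted());
}
```

Doing this by hand everywhere is what I would like to avoid; the point of marking `foo` as const is that the compiler should be able to do it for me.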

So how do I get the desirable FreeBSD clang behaviour? Do I have to 
configure clang in a special way?

Ciao
    Nat!









