[llvm-bugs] [Bug 34972] New: frontend optimization of constant aggregate initializers can pessimize final code
via llvm-bugs
llvm-bugs at lists.llvm.org
Mon Oct 16 23:56:25 PDT 2017
https://bugs.llvm.org/show_bug.cgi?id=34972
Bug ID: 34972
Summary: frontend optimization of constant aggregate initializers can pessimize final code
Product: clang
Version: trunk
Hardware: All
OS: All
Status: NEW
Severity: enhancement
Priority: P
Component: -New Bugs
Assignee: unassignedclangbugs at nondot.org
Reporter: benjamin at benjamin.pe
CC: llvm-bugs at lists.llvm.org
InstCombine knows some good tricks for constant aggregates. It can eliminate
copies of read-only aggregates onto the stack as well as turn table lookups
into inline comparisons. Unfortunately, the way clang emits aggregate constants
can sometimes foil these optimizations.
To illustrate, here is a very silly function for checking whether a char
parameter is 3:
int equals_3(char c) {
    char table[] = {0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0};
    return c >= 0 && table[(unsigned char)c];
}
At -O3, clang emits the following IR for this function:
; Function Attrs: nounwind readnone uwtable
define i32 @equals_3(i8 signext) local_unnamed_addr #0 {
  %2 = alloca [128 x i8], align 16
  %3 = getelementptr inbounds [128 x i8], [128 x i8]* %2, i64 0, i64 0
  call void @llvm.lifetime.start.p0i8(i64 128, i8* nonnull %3) #2
  call void @llvm.memset.p0i8.i64(i8* nonnull %3, i8 0, i64 128, i32 16, i1 false)
  %4 = getelementptr inbounds [128 x i8], [128 x i8]* %2, i64 0, i64 3
  store i8 1, i8* %4, align 1
  %5 = icmp sgt i8 %0, -1
  br i1 %5, label %6, label %12

; <label>:6:                                      ; preds = %1
  %7 = zext i8 %0 to i64
  %8 = getelementptr inbounds [128 x i8], [128 x i8]* %2, i64 0, i64 %7
  %9 = load i8, i8* %8, align 1, !tbaa !2
  %10 = icmp ne i8 %9, 0
  %11 = zext i1 %10 to i32
  br label %12

; <label>:12:                                     ; preds = %6, %1
  %13 = phi i32 [ 0, %1 ], [ %11, %6 ]
  call void @llvm.lifetime.end.p0i8(i64 128, i8* nonnull %3) #2
  ret i32 %13
}
clang has decided to emit the table as a memset of the stack followed by a
store. This obfuscates the initialization enough that InstCombine won't improve
the code.
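As a rough illustration, the emitted IR corresponds to source code that zero-fills the array and then patches one element. The sketch below is a hand-written C analogue of that IR, not something clang produces at the source level; the name equals_3_memset is made up for this example.

```c
#include <string.h>

/* Hand-written analogue of the IR above: the initializer is split into
 * a bulk zero-fill plus a single store. Hypothetical name. */
int equals_3_memset(char c) {
    char table[128];
    memset(table, 0, sizeof table); /* corresponds to the llvm.memset call */
    table[3] = 1;                   /* corresponds to the "store i8 1" */
    return c >= 0 && table[(unsigned char)c];
}
```

Written this way, the table no longer appears as a single constant initializer, which is presumably why the constant-aggregate folds do not apply.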
Amusingly, by inverting the table we can evade the memset "optimization" and
get excellent final code:
int equals_3(char c) {
    char table[] = {1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1};
    return c >= 0 && !table[(unsigned char)c];
}
define i32 @equals_3(i8 signext) local_unnamed_addr #0 {
  %2 = icmp eq i8 %0, 3
  %3 = zext i1 %2 to i32
  ret i32 %3
}
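For reference, the optimized IR is just a comparison against 3. The following sketch checks that a table lookup with a single 1 at index 3 and a direct comparison agree on every input. It assumes a signed char, as the "i8 signext" parameter suggests; the function names are invented for this example, and the C99 designated initializer is only shorthand for the full 128-element list.

```c
#include <limits.h>

/* Table-lookup form; {[3] = 1} abbreviates the 128-element initializer. */
int equals_3_table(char c) {
    char table[128] = {[3] = 1};
    return c >= 0 && table[(unsigned char)c];
}

/* What the optimized IR computes: one comparison. */
int equals_3_direct(char c) {
    return c == 3;
}

/* Hypothetical helper: counts inputs on which the two forms disagree.
 * The upper bound is 127 so the lookup stays inside the table even on
 * targets where char is unsigned. */
int count_mismatches(void) {
    int mismatches = 0;
    for (int v = CHAR_MIN; v <= 127; ++v) {
        char c = (char)v;
        if (equals_3_table(c) != equals_3_direct(c))
            ++mismatches;
    }
    return mismatches;
}
```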
Making the table a const variable also generates good code because the frontend
emits the table directly as a static global (which incidentally can be
non-conforming; see bug 18538).
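A minimal sketch of the const variant follows. The name equals_3_const is made up here, and the designated initializer again stands in for the full list; whether a given clang version emits this as a global and folds the lookup is an assumption to verify against the actual IR.

```c
/* Same lookup with a const-qualified local table. Per the report, the
 * frontend emits such a table directly as a constant global rather
 * than initializing a stack copy. Hypothetical name. */
int equals_3_const(char c) {
    const char table[128] = {[3] = 1};
    return c >= 0 && table[(unsigned char)c];
}
```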