[PATCH] D150218: [ConstantMerge] Only merge constant w/unnamed_addr
Gulfem Savrun Yeniceri via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue May 9 11:53:26 PDT 2023
gulfem created this revision.
Herald added subscribers: hoy, ormris, hiraditya, arichardson.
Herald added a project: All.
gulfem requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.
Currently, ConstantMergePass merges an unnamed_addr constant with a
non-unnamed_addr constant, as LangRef permits:
"Note that a constant with significant address can be merged with a
unnamed_addr constant, the result being a constant whose address is
significant."
https://llvm.org/docs/LangRef.html#global-variables
This can result in a situation where Clang violates C semantics. Here
is a small reproducer that shows the problem:
#include <stdio.h>
const char foo_string[] = "foo";
const char* foo_func(void) { return "foo"; }
int is_foo(const char* p) { return p == foo_string; }
int main() {
  printf("is_foo: %d\n", is_foo("foo"));
}
When we compile with -O0, where ConstantMerge is not applied, Clang
and GCC have the same result.
clang -O0 foo.c -o foo
./foo
is_foo: 0
gcc -O0 foo.c -o foo
./foo
is_foo: 0
When we compile with -O1 or higher, where ConstantMerge is applied,
Clang and GCC produce different results.
clang -O3 foo.c -o foo
./foo
is_foo: 1
gcc -O3 foo.c -o foo
./foo
is_foo: 0
Here's the IR before ConstantMergePass runs:
@.str = private unnamed_addr constant [4 x i8] c"foo\00", align 1
@_ZL10foo_string = internal constant [4 x i8] c"foo\00", align 1
@.str.1 = private unnamed_addr constant [12 x i8] c"is_foo: %d\0A\00", align 1
; Function Attrs: mustprogress nofree norecurse nosync nounwind readnone uwtable willreturn
define noundef i8* @_Z8foo_funcv() local_unnamed_addr #0 {
  ret i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0)
}
; Function Attrs: mustprogress nofree norecurse nosync nounwind readnone uwtable willreturn
define noundef i32 @_Z6is_fooPKc(i8* noundef readnone %0) local_unnamed_addr #0 {
  %2 = icmp eq i8* %0, getelementptr inbounds ([4 x i8], [4 x i8]* @_ZL10foo_string, i64 0, i64 0)
  %3 = zext i1 %2 to i32
  ret i32 %3
}
; Function Attrs: mustprogress nofree norecurse nounwind uwtable
define noundef i32 @main() local_unnamed_addr #1 {
  %1 = tail call i32 (i8*, ...) @printf(i8* noundef nonnull dereferenceable(1) getelementptr inbounds ([12 x i8], [12 x i8]* @.str.1, i64 0, i64 0), i32 noundef zext (i1 icmp eq (i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), i8* getelementptr inbounds ([4 x i8], [4 x i8]* @_ZL10foo_string, i64 0, i64 0)) to i32))
  ret i32 0
}
ConstantMergePass merges `_ZL10foo_string` into `.str`; that is, it merges
a non-`unnamed_addr` constant into an `unnamed_addr` constant.
*** IR Dump After ConstantMergePass on [module] ***
@.str = private constant [4 x i8] c"foo\00", align 1
@.str.1 = private unnamed_addr constant [12 x i8] c"is_foo: %d\0A\00", align 1
; Function Attrs: mustprogress nofree norecurse nosync nounwind readnone uwtable willreturn
define noundef i8* @_Z8foo_funcv() local_unnamed_addr #0 {
  ret i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0)
}
; Function Attrs: mustprogress nofree norecurse nosync nounwind readnone uwtable willreturn
define noundef i32 @_Z6is_fooPKc(i8* noundef readnone %0) local_unnamed_addr #0 {
  %2 = icmp eq i8* %0, getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0)
  %3 = zext i1 %2 to i32
  ret i32 %3
}
; Function Attrs: mustprogress nofree norecurse nounwind uwtable
define noundef i32 @main() local_unnamed_addr #1 {
  %1 = tail call i32 (i8*, ...) @printf(i8* noundef nonnull dereferenceable(1) getelementptr inbounds ([12 x i8], [12 x i8]* @.str.1, i64 0, i64 0), i32 noundef 1)
  ret i32 0
}
This transformation violates the following C pointer semantics:
"Two pointers compare equal if and only if both are null pointers,
both are pointers to the same object (including a pointer to an object
and a subobject at its beginning) or function, both are pointers to
one past the last element of the same array object, or one is a pointer
to one past the end of one array object and the other is a pointer to
the start of a different array object that happens to immediately
follow the first array object in the address space."
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2912.pdf
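As a small illustration of the quoted rule, here is a sketch with two
named arrays that hold identical bytes. The names `first_foo`,
`second_foo`, and `same_address` are hypothetical and do not appear in
the patch. Because the arrays are distinct objects, pointers to their
first elements must compare unequal, even though merging their storage
would not change what a read through either pointer returns.

```c
/* Two distinct named array objects that happen to hold the same bytes. */
static const char first_foo[] = "foo";
static const char second_foo[] = "foo";

/* Returns 1 if both pointers hold the same address, 0 otherwise. */
int same_address(const char *p, const char *q) { return p == q; }
```

Under the rule above, `same_address(first_foo, second_foo)` has to be 0,
while `same_address(first_foo, first_foo)` is 1.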
So, this patch changes the ConstantMerge pass to only allow merging
when a constant is marked with the `unnamed_addr` attribute.
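To make the difference between the two rules concrete, here is a toy
model in C. It is not the code in ConstantMerge.cpp and does not use any
LLVM API; `toy_constant`, `may_merge_langref`, and `may_merge_strict`
are hypothetical names. It also assumes one reading of the description
above, namely that both constants must carry `unnamed_addr` before they
can be merged.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a global constant; not LLVM's GlobalVariable. */
struct toy_constant {
    const char *name;   /* symbol name, for readability only */
    const char *bytes;  /* initializer contents */
    size_t size;        /* initializer size in bytes */
    bool unnamed_addr;  /* true if the address is not significant */
};

/* Two constants are merge candidates only if their initializers match. */
static bool same_initializer(const struct toy_constant *a,
                             const struct toy_constant *b) {
    return a->size == b->size && memcmp(a->bytes, b->bytes, a->size) == 0;
}

/* Rule quoted from LangRef: one unnamed_addr side is enough. */
bool may_merge_langref(const struct toy_constant *a,
                       const struct toy_constant *b) {
    return same_initializer(a, b) && (a->unnamed_addr || b->unnamed_addr);
}

/* Assumed stricter rule: both sides must be unnamed_addr. */
bool may_merge_strict(const struct toy_constant *a,
                      const struct toy_constant *b) {
    return same_initializer(a, b) && a->unnamed_addr && b->unnamed_addr;
}
```

In this model, `.str` (unnamed_addr) and `_ZL10foo_string` (address
significant) pass `may_merge_langref` but fail `may_merge_strict`, which
corresponds to the case in the reproducer.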
I also found an old GitHub issue that describes a similar problem with
invalid constant merging.
https://github.com/llvm/llvm-project/issues/9299
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D150218
Files:
llvm/lib/Transforms/IPO/ConstantMerge.cpp
llvm/test/Transforms/ConstantMerge/2011-01-15-EitherOrder.ll
llvm/test/Transforms/ConstantMerge/merge-both.ll
llvm/test/Transforms/ConstantMerge/merge-dbg.ll
llvm/test/Transforms/ConstantMerge/unnamed-addr.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D150218.520778.patch
Type: text/x-patch
Size: 5100 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20230509/af7644df/attachment.bin>