[PATCH] D71374: Improve support of GNU mempcpy
serge via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Fri Dec 13 03:03:14 PST 2019
serge-sans-paille added a comment.
In D71374#1783032 <https://reviews.llvm.org/D71374#1783032>, @Jim wrote:
> I am curious what is the difference in code generation after applying your changes?
Before this change, when compiling:
#define _GNU_SOURCE
#include <string.h>
void* foo(void* to, void* from, unsigned n) {
  return mempcpy(mempcpy(to, from, n), from, n);
}
We get (clang -O3):
define i8* @foo(i8*, i8*, i32) #0 {
  %4 = alloca i8*, align 8
  %5 = alloca i8*, align 8
  %6 = alloca i32, align 4
  store i8* %0, i8** %4, align 8
  store i8* %1, i8** %5, align 8
  store i32 %2, i32* %6, align 4
  %7 = load i8*, i8** %4, align 8
  %8 = load i8*, i8** %5, align 8
  %9 = load i32, i32* %6, align 4
  %10 = zext i32 %9 to i64
  %11 = call i8* @mempcpy(i8* %7, i8* %8, i64 %10) #2
  %12 = load i8*, i8** %5, align 8
  %13 = load i32, i32* %6, align 4
  %14 = zext i32 %13 to i64
  %15 = call i8* @mempcpy(i8* %11, i8* %12, i64 %14) #2
  ret i8* %15
}
And we now get:
define dso_local i8* @foo(i8* %to, i8* nocapture readonly %from, i32 %n) local_unnamed_addr #0 {
entry:
  %conv = zext i32 %n to i64
  tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 %to, i8* align 1 %from, i64 %conv, i1 false)
  %0 = getelementptr i8, i8* %to, i64 %conv
  tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 %0, i8* align 1 %from, i64 %conv, i1 false)
  %1 = getelementptr i8, i8* %0, i64 %conv
  ret i8* %1
}
This looks much better to me, especially as each mempcpy call is now lowered to the llvm.memcpy intrinsic plus a getelementptr, which unlocks memcpy-specific optimisations.
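For readers less familiar with mempcpy, the second listing corresponds to the identity "mempcpy(to, from, n) is memcpy(to, from, n) followed by returning to + n". The following is only a sketch of that identity in portable C, not the code from the patch; the names mempcpy_sketch and foo_sketch are hypothetical and are used so the example does not depend on the GNU extension being available.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for GNU mempcpy (not the libc implementation):
 * copy n bytes, then return a pointer one past the last byte written. */
static void *mempcpy_sketch(void *to, const void *from, size_t n) {
  memcpy(to, from, n);
  return (char *)to + n;
}

/* Same shape as foo() in the comment above: the second copy starts
 * where the first one ended, so the result is to + 2 * n. */
static void *foo_sketch(void *to, const void *from, unsigned n) {
  return mempcpy_sketch(mempcpy_sketch(to, from, n), from, n);
}
```

Calling foo_sketch(dst, src, 3) on a buffer of at least 6 bytes writes the 3 source bytes twice, back to back, and returns dst + 6, which matches the two getelementptr instructions in the optimised IR.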
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D71374/new/
https://reviews.llvm.org/D71374