[PATCH] D39906: [InstCombine] Allowing GEP Instructions with loop Invariant operands to combine
DIVYA SHANMUGHAN via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Nov 10 09:11:47 PST 2017
DIVYA added a comment.
This patch reduces the number of instructions LLVM generates on AArch64 for longest_match(), the hottest function in the zlib-ng library.
Profile output from valgrind for longest_match() in match.c, compiled with gcc for x86_64 at -O3:
. . . . . . . . . unsigned char *win = s->window;
. . . . . . . . . int cont = 1;
. . . . . . . . . do {
330,962,412 1 1 0 0 0 0 0 0 match = win + cur_match;
330,962,412 0 0 165,481,206 63,723,323 0 0 0 0 if (likely(*(uint16_t*)(match+best_len-1) != scan_end)) {
623,919,768 0 0 155,979,942 86,247,452 0 0 0 0 if ((cur_match = prev[cur_match & wmask]) > limit
308,955,535 0 0 5,387,447 1,860 0 0 0 0 && --chain_length != 0) {
Assembly code before applying the patch.
The loop contains two adds ahead of the load:
.LBB0_7: // %do.body37
// in Loop: Header=BB0_8 Depth=2
add x26, x8, x20
add x10, x26, x9
ldurh w10, [x10, #-1]
cmp w10, w28, uxth
b.eq .LBB0_10
After applying the patch, only one add remains in the loop and the load uses a register offset:
.LBB0_8: // %if.then49
// Parent Loop BB0_5 Depth=1
// => This Inner Loop Header: Depth=2
add x26, x8, x20
ldrh w10, [x26, x9]
cmp w10, w28, uxth
b.ne .LBB0_8
.LBB0_11:
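For readers who want the transformation at source level, here is a minimal C sketch of the idea, not the patch itself. It assumes the second add in the listing computes `match + best_len - 1` and that `best_len - 1` does not change across iterations of this loop. The names `load16`, `count_naive`, `count_hoisted`, and `cands` are hypothetical and are not taken from zlib-ng or LLVM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read 16 bits from a possibly unaligned address without undefined behavior. */
static uint16_t load16(const unsigned char *p) {
    uint16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Shape of the original loop: two address computations per iteration. */
static size_t count_naive(const unsigned char *win, const size_t *cands,
                          size_t n, size_t best_len, uint16_t scan_end) {
    size_t hits = 0;
    assert(best_len >= 1);
    for (size_t i = 0; i < n; i++) {
        const unsigned char *match = win + cands[i];        /* add 1 */
        if (load16(match + best_len - 1) == scan_end)       /* add 2, then load */
            hits++;
    }
    return hits;
}

/* Same loop with the invariant part of the offset computed once, outside. */
static size_t count_hoisted(const unsigned char *win, const size_t *cands,
                            size_t n, size_t best_len, uint16_t scan_end) {
    size_t hits = 0;
    assert(best_len >= 1);
    const size_t inv_off = best_len - 1;                     /* loop invariant */
    for (size_t i = 0; i < n; i++) {
        const unsigned char *match = win + cands[i];        /* the one remaining add */
        if (load16(match + inv_off) == scan_end)            /* base + register offset */
            hits++;
    }
    return hits;
}
```

Both functions return the same count for any input with `best_len >= 1`. The second form corresponds to the `ldrh w10, [x26, x9]` shape above, where the offset register would already hold the invariant value when the loop is entered.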
https://reviews.llvm.org/D39906