[libc-commits] [PATCH] D132128: [libc] improve {mem|b}cmp for aarch64

Guillaume Chatelet via Phabricator via libc-commits libc-commits at lists.llvm.org
Thu Aug 18 06:59:43 PDT 2022


gchatelet added a comment.

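On Neoverse N1 this patch does not seem to radically change the performance of `bcmp` and `memcmp`. `memcmp` is essentially neutral (statistically significant deltas range from -2.07% to +0.98%), while `bcmp` is slightly negative (down by as much as 5.55%, on the Google Q distribution).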

  name                                                       old speed               new speed               delta
  BM_Memcpy/0/0  [__llvm_libc::memcpy,memcpy Google A     ]  16.0GB/s ± 4%           15.9GB/s ± 3%    ~           (p=0.127 n=20+20)
  BM_Memcpy/1/0  [__llvm_libc::memcpy,memcpy Google B     ]  6.77GB/s ± 9%           7.00GB/s ± 7%  +3.47%        (p=0.017 n=20+20)
  BM_Memcpy/2/0  [__llvm_libc::memcpy,memcpy Google D     ]  31.0GB/s ± 2%           30.9GB/s ± 2%    ~           (p=0.529 n=20+20)
  BM_Memcpy/3/0  [__llvm_libc::memcpy,memcpy Google L     ]  6.39GB/s ± 9%           6.45GB/s ±10%    ~           (p=0.588 n=20+19)
  BM_Memcpy/4/0  [__llvm_libc::memcpy,memcpy Google M     ]  5.55GB/s ± 8%           5.64GB/s ± 7%    ~           (p=0.398 n=20+20)
  BM_Memcpy/5/0  [__llvm_libc::memcpy,memcpy Google Q     ]  2.77GB/s ±12%           2.77GB/s ± 8%    ~           (p=0.923 n=19+20)
  BM_Memcpy/6/0  [__llvm_libc::memcpy,memcpy Google S     ]  6.51GB/s ± 6%           6.62GB/s ± 5%    ~           (p=0.068 n=20+20)
  BM_Memcpy/7/0  [__llvm_libc::memcpy,memcpy Google U     ]  8.08GB/s ±10%           7.84GB/s ± 9%    ~           (p=0.059 n=18+20)
  BM_Memcpy/8/0  [__llvm_libc::memcpy,memcpy Google W     ]  5.80GB/s ± 5%           5.77GB/s ± 5%    ~           (p=0.565 n=20+20)
  BM_Memcpy/9/0  [__llvm_libc::memcpy,uniform 384 to 4096 ]  41.3GB/s ± 0%           41.3GB/s ± 0%    ~           (p=0.547 n=20+20)
  BM_Memmove/0/0 [__llvm_libc::memmove,memmove Google A   ]  2.34GB/s ± 8%           2.37GB/s ± 9%    ~           (p=0.369 n=20+20)
  BM_Memmove/1/0 [__llvm_libc::memmove,memmove Google B   ]  5.18GB/s ± 2%           5.17GB/s ± 3%    ~           (p=0.583 n=20+20)
  BM_Memmove/2/0 [__llvm_libc::memmove,memmove Google D   ]  10.4GB/s ± 5%           10.3GB/s ± 6%    ~           (p=0.461 n=20+20)
  BM_Memmove/3/0 [__llvm_libc::memmove,memmove Google L   ]  4.39GB/s ± 9%           4.42GB/s ± 8%    ~           (p=0.659 n=20+20)
  BM_Memmove/4/0 [__llvm_libc::memmove,memmove Google M   ]  3.88GB/s ± 4%           3.84GB/s ± 7%    ~           (p=0.383 n=20+20)
  BM_Memmove/5/0 [__llvm_libc::memmove,memmove Google Q   ]  3.57GB/s ±11%           3.51GB/s ±14%    ~           (p=0.461 n=20+20)
  BM_Memmove/6/0 [__llvm_libc::memmove,memmove Google S   ]  6.97GB/s ± 5%           7.00GB/s ± 5%    ~           (p=0.428 n=19+20)
  BM_Memmove/7/0 [__llvm_libc::memmove,memmove Google U   ]  2.95GB/s ±11%           3.00GB/s ±11%    ~           (p=0.583 n=20+20)
  BM_Memmove/8/0 [__llvm_libc::memmove,memmove Google W   ]  5.41GB/s ± 4%           5.41GB/s ± 3%    ~           (p=0.925 n=20+20)
  BM_Memmove/9/0 [__llvm_libc::memmove,uniform 384 to 4096]  34.2GB/s ± 1%           34.1GB/s ± 0%    ~           (p=0.102 n=20+20)
  BM_Memcmp/0/0  [__llvm_libc::memcmp,memcmp Google A     ]  1.50GB/s ± 6%           1.47GB/s ± 5%  -2.07%        (p=0.028 n=20+20)
  BM_Memcmp/1/0  [__llvm_libc::memcmp,memcmp Google B     ]  4.25GB/s ± 3%           4.27GB/s ± 4%    ~           (p=0.565 n=20+20)
  BM_Memcmp/2/0  [__llvm_libc::memcmp,memcmp Google D     ]  2.87GB/s ± 3%           2.85GB/s ± 4%    ~           (p=0.201 n=20+20)
  BM_Memcmp/3/0  [__llvm_libc::memcmp,memcmp Google L     ]  3.14GB/s ± 1%           3.17GB/s ± 1%  +0.98%        (p=0.000 n=19+20)
  BM_Memcmp/4/0  [__llvm_libc::memcmp,memcmp Google M     ]  1.29GB/s ± 8%           1.29GB/s ± 9%    ~           (p=0.620 n=20+20)
  BM_Memcmp/5/0  [__llvm_libc::memcmp,memcmp Google Q     ]  2.57GB/s ± 6%           2.53GB/s ±10%  -1.63%        (p=0.046 n=20+20)
  BM_Memcmp/6/0  [__llvm_libc::memcmp,memcmp Google S     ]  3.80GB/s ± 2%           3.80GB/s ± 3%    ~           (p=0.835 n=19+20)
  BM_Memcmp/7/0  [__llvm_libc::memcmp,memcmp Google U     ]  2.78GB/s ± 4%           2.74GB/s ± 3%  -1.49%        (p=0.017 n=20+20)
  BM_Memcmp/8/0  [__llvm_libc::memcmp,memcmp Google W     ]  1.54GB/s ± 1%           1.52GB/s ± 1%  -1.67%        (p=0.000 n=20+20)
  BM_Memcmp/9/0  [__llvm_libc::memcmp,uniform 384 to 4096 ]  23.8GB/s ± 0%           23.8GB/s ± 0%  +0.20%        (p=0.000 n=20+20)
  BM_Bcmp/0/0    [__llvm_libc::bcmp,memcmp Google A       ]  1.66GB/s ± 2%           1.58GB/s ± 2%  -4.52%        (p=0.000 n=16+16)
  BM_Bcmp/1/0    [__llvm_libc::bcmp,memcmp Google B       ]  4.13GB/s ± 3%           4.12GB/s ± 3%    ~           (p=0.435 n=19+19)
  BM_Bcmp/2/0    [__llvm_libc::bcmp,memcmp Google D       ]  3.01GB/s ± 1%           2.89GB/s ± 4%  -4.09%        (p=0.000 n=17+20)
  BM_Bcmp/3/0    [__llvm_libc::bcmp,memcmp Google L       ]  3.14GB/s ± 1%           3.07GB/s ± 1%  -2.22%        (p=0.000 n=19+20)
  BM_Bcmp/4/0    [__llvm_libc::bcmp,memcmp Google M       ]  1.29GB/s ± 7%           1.23GB/s ± 9%    ~           (p=0.081 n=20+20)
  BM_Bcmp/5/0    [__llvm_libc::bcmp,memcmp Google Q       ]  2.46GB/s ± 4%           2.32GB/s ± 8%  -5.55%        (p=0.000 n=20+19)
  BM_Bcmp/6/0    [__llvm_libc::bcmp,memcmp Google S       ]  3.81GB/s ± 3%           3.75GB/s ± 3%  -1.55%        (p=0.002 n=20+20)
  BM_Bcmp/7/0    [__llvm_libc::bcmp,memcmp Google U       ]  2.81GB/s ± 3%           2.72GB/s ± 1%  -3.18%        (p=0.000 n=20+17)
  BM_Bcmp/8/0    [__llvm_libc::bcmp,memcmp Google W       ]  1.57GB/s ± 1%           1.52GB/s ± 1%  -3.23%        (p=0.000 n=19+19)
  BM_Bcmp/9/0    [__llvm_libc::bcmp,uniform 384 to 4096   ]  23.0GB/s ± 0%           23.0GB/s ± 0%  -0.28%        (p=0.000 n=20+20)
  BM_Memset/0/0  [__llvm_libc::memset,memset Google A     ]  10.3GB/s ± 3%           10.4GB/s ± 5%    ~           (p=0.224 n=19+20)
  BM_Memset/1/0  [__llvm_libc::memset,memset Google B     ]  8.64GB/s ± 8%           8.64GB/s ± 6%    ~           (p=0.967 n=20+19)
  BM_Memset/2/0  [__llvm_libc::memset,memset Google D     ]  26.1GB/s ± 2%           26.2GB/s ± 2%    ~           (p=0.081 n=20+20)
  BM_Memset/3/0  [__llvm_libc::memset,memset Google L     ]  10.9GB/s ± 4%           11.0GB/s ± 4%    ~           (p=0.174 n=20+20)
  BM_Memset/4/0  [__llvm_libc::memset,memset Google M     ]  20.1GB/s ± 2%           20.0GB/s ± 3%    ~           (p=0.383 n=20+20)
  BM_Memset/5/0  [__llvm_libc::memset,memset Google Q     ]  13.4GB/s ± 3%           13.1GB/s ± 4%  -2.07%        (p=0.001 n=20+20)
  BM_Memset/6/0  [__llvm_libc::memset,memset Google S     ]  10.5GB/s ± 4%           10.6GB/s ± 4%    ~           (p=0.253 n=20+20)
  BM_Memset/7/0  [__llvm_libc::memset,memset Google U     ]  9.51GB/s ± 4%           9.47GB/s ± 4%    ~           (p=0.461 n=20+20)
  BM_Memset/8/0  [__llvm_libc::memset,memset Google W     ]  12.1GB/s ± 2%           12.1GB/s ± 2%    ~           (p=0.512 n=20+20)
  BM_Memset/9/0  [__llvm_libc::memset,uniform 384 to 4096 ]  44.8GB/s ± 0%           44.8GB/s ± 0%    ~           (p=0.620 n=20+20)
  BM_Bzero/0/0   [__llvm_libc::bzero,memset Google A      ]  10.6GB/s ± 4%           10.6GB/s ± 4%    ~           (p=0.383 n=20+20)
  BM_Bzero/1/0   [__llvm_libc::bzero,memset Google B      ]  8.76GB/s ± 4%           8.83GB/s ± 4%    ~           (p=0.495 n=20+20)
  BM_Bzero/2/0   [__llvm_libc::bzero,memset Google D      ]  28.5GB/s ± 2%           28.5GB/s ± 2%    ~           (p=0.341 n=20+20)
  BM_Bzero/3/0   [__llvm_libc::bzero,memset Google L      ]  10.9GB/s ± 3%           10.9GB/s ± 2%    ~           (p=0.778 n=19+17)
  BM_Bzero/4/0   [__llvm_libc::bzero,memset Google M      ]  21.6GB/s ± 2%           21.4GB/s ± 3%  -0.96%        (p=0.024 n=20+19)
  BM_Bzero/5/0   [__llvm_libc::bzero,memset Google Q      ]  13.4GB/s ± 3%           13.3GB/s ± 5%    ~           (p=0.121 n=20+20)
  BM_Bzero/6/0   [__llvm_libc::bzero,memset Google S      ]  10.4GB/s ± 4%           10.4GB/s ± 4%    ~           (p=0.445 n=20+20)
  BM_Bzero/7/0   [__llvm_libc::bzero,memset Google U      ]  9.45GB/s ± 5%           9.48GB/s ± 5%    ~           (p=0.512 n=20+20)
  BM_Bzero/8/0   [__llvm_libc::bzero,memset Google W      ]  12.1GB/s ± 2%           12.1GB/s ± 2%    ~           (p=0.841 n=20+20)
  BM_Bzero/9/0   [__llvm_libc::bzero,uniform 384 to 4096  ]  71.4GB/s ± 0%           71.4GB/s ± 0%    ~           (p=0.904 n=20+20)
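
To make the summary above easier to check, here is a minimal sketch that pulls the statistically significant deltas out of rows formatted like the table above and groups them by benchmark family. The names `ROW` and `significant_deltas` are made up for this illustration and are not part of the libc benchmark tooling; the regular expression assumes the exact column layout shown here (speeds in GB/s, `~` for a non-significant delta).

```python
import re
from collections import defaultdict

# One row of the comparison table: benchmark name, bracketed label,
# old speed, new speed, delta ("~" when not significant), p-value and n.
ROW = re.compile(
    r"^\s*BM_(?P<family>\w+)/\d+/\d+\s+\[(?P<label>[^\]]*)\]\s+"
    r"(?P<old>[\d.]+)GB/s\s+±\s*\d+%\s+"
    r"(?P<new>[\d.]+)GB/s\s+±\s*\d+%\s+"
    r"(?P<delta>~|[+-][\d.]+%)\s+"
    r"\(p=(?P<p>[\d.]+)\s+n=\d+\+\d+\)\s*$"
)


def significant_deltas(lines):
    """Return {family: [delta in percent, ...]}, skipping "~" rows."""
    out = defaultdict(list)
    for line in lines:
        m = ROW.match(line)
        if m is None or m["delta"] == "~":
            continue  # not a table row, or no significant change
        out[m["family"]].append(float(m["delta"].rstrip("%")))
    return dict(out)


if __name__ == "__main__":
    # Three rows copied from the table above.
    sample = [
        "  BM_Memcmp/3/0  [__llvm_libc::memcmp,memcmp Google L     ]  3.14GB/s ± 1%           3.17GB/s ± 1%  +0.98%        (p=0.000 n=19+20)",
        "  BM_Bcmp/0/0    [__llvm_libc::bcmp,memcmp Google A       ]  1.66GB/s ± 2%           1.58GB/s ± 2%  -4.52%        (p=0.000 n=16+16)",
        "  BM_Bcmp/1/0    [__llvm_libc::bcmp,memcmp Google B       ]  4.13GB/s ± 3%           4.12GB/s ± 3%    ~           (p=0.435 n=19+19)",
    ]
    for family, deltas in significant_deltas(sample).items():
        print(f"{family}: worst {min(deltas):+.2f}%, best {max(deltas):+.2f}%")
```

Feeding the whole table through `significant_deltas` gives a per-function list of the deltas that were reported as significant, which is one way to compare `memcmp` and `bcmp` side by side.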


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D132128/new/

https://reviews.llvm.org/D132128


