[PATCH] D69044: [X86] Allow up to 4 loads per inline memcmp()

David Zarzycki via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Oct 19 23:44:11 PDT 2019


davezarzycki added a comment.

@craig.topper – That does seem to fix the bug. Thanks!

Just FYI everybody: I built LLVM+clang+lld with this change, and I see no evidence of more than two AVX512 load pairs being generated, apart from one LLVM unit test (which failed until Craig's patch). And yes, one could argue that LLVM/clang/lld aren't representative of normal code, but let's set that aside for now. I just wanted to test a large source base.

Moreover, the ratios are consistent with my assertion that inline memcmps have a log-normal distribution (with small mean/median/variance). For example, clang has ~3300 XMM load pairs, ~230 YMM load pairs, and ~40 ZMM load pairs. These values are approximate because the script I wrote might have missed something.
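
The counting script is not included in this message. Purely as an illustration, here is a rough sketch of how counts like these might be gathered. It is not the script used for the numbers above. It assumes AT&T-syntax assembly text (as from `clang -S`) with the mnemonic first on each line, and the list of mnemonics taken to mark the compare step of an expanded memcmp is a guess. The names `COMPARE_MNEMONICS` and `count_by_width` are hypothetical.

```python
import re
from collections import Counter

# Assumption: these mnemonics mark the compare step of an expanded memcmp.
# This is a crude heuristic; e.g. vpxor is also used to zero registers,
# so the counts can be inflated.
COMPARE_MNEMONICS = ("pcmpeqb", "vpcmpeqb", "vpcmpneqb", "pxor", "vpxor")

# Matches %xmmN, %ymmN, %zmmN and captures the width letter.
REGISTER = re.compile(r"%([xyz])mm\d+")


def count_by_width(assembly: str) -> Counter:
    """Count candidate compare instructions by their widest vector register."""
    counts = Counter()
    for line in assembly.splitlines():
        fields = line.split()
        if not fields or fields[0] not in COMPARE_MNEMONICS:
            continue
        classes = set(REGISTER.findall(line))
        # Attribute the instruction to the widest register class it mentions.
        for cls in "zyx":
            if cls in classes:
                counts[cls.upper() + "MM"] += 1
                break
    return counts


if __name__ == "__main__":
    sample = (
        "\tvmovdqu (%rdi), %ymm0\n"
        "\tvpxor (%rsi), %ymm0, %ymm0\n"
        "\tvpcmpneqb (%rsi), %zmm0, %k0\n"
        "\tpcmpeqb %xmm1, %xmm0\n"
        "\tretq\n"
    )
    print(dict(count_by_width(sample)))
```

A heuristic like this can both over-count and miss cases, which is one reason values obtained this way should be treated as approximate.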


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69044/new/

https://reviews.llvm.org/D69044




