[libc-commits] [PATCH] D148717: [libc] Improve memcmp latency and codegen

Guillaume Chatelet via Phabricator via libc-commits libc-commits at lists.llvm.org
Wed Apr 19 08:22:06 PDT 2023


gchatelet created this revision.
gchatelet added a reviewer: courbet.
Herald added subscribers: libc-commits, ecnelises, tschuett, kristof.beyls.
Herald added projects: libc-project, All.
gchatelet requested review of this revision.

This is based on ideas from @nafi to:

- use a branchless version of 'cmp' for 'uint32_t',
- completely resolve the lexicographic comparison through vector operations when wide types are available. We also get rid of byte reloads and serializing '__builtin_ctzll'.
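
For illustration, the two ideas above can be sketched in portable C++. This is
a minimal sketch, not the code from the patch: the names 'load_be32',
'cmp4_branchless' and 'cmp16_masks' are hypothetical, and the second function
emulates with a scalar loop the per-byte masks that a vector compare followed
by a movemask-style instruction would produce.

```cpp
#include <cstdint>

// Hypothetical helper: load 4 bytes as a big-endian integer so that integer
// order matches lexicographic byte order.
static inline uint32_t load_be32(const unsigned char *p) {
  return (uint32_t(p[0]) << 24) | (uint32_t(p[1]) << 16) |
         (uint32_t(p[2]) << 8) | uint32_t(p[3]);
}

// Hypothetical name. Three-way comparison of 4 bytes written without an
// explicit branch: the result is -1, 0 or 1, computed from two comparisons.
static inline int cmp4_branchless(const void *lhs, const void *rhs) {
  const uint32_t a = load_be32(static_cast<const unsigned char *>(lhs));
  const uint32_t b = load_be32(static_cast<const unsigned char *>(rhs));
  return int(a > b) - int(a < b);
}

// Hypothetical name. Scalar emulation of a 16-byte vector comparison.
// Bit (15 - i) of 'gt' is set when lhs[i] > rhs[i], and likewise for 'lt'.
// The first differing byte maps to the highest set bit, so comparing the
// two masks as integers gives the lexicographic order directly, with no
// bit scan and no reload of the differing byte.
static inline int cmp16_masks(const unsigned char *lhs,
                              const unsigned char *rhs) {
  uint32_t gt = 0, lt = 0;
  for (int i = 0; i < 16; ++i) {
    gt |= uint32_t(lhs[i] > rhs[i]) << (15 - i);
    lt |= uint32_t(lhs[i] < rhs[i]) << (15 - i);
  }
  return int(gt > lt) - int(gt < lt);
}
```

Both functions only guarantee the sign of the result, which is all 'memcmp'
requires. Whether a compiler actually emits branch-free code for them depends
on the target and the optimization level.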

I did not include the suggestion to replace comparisons of 'uint16_t'
with two 'uint8_t' as it did not seem to help the codegen. This can
be revisited in subsequent patches.

The code has been rewritten to reduce nested function calls, making the
job of the inliner easier and preventing harmful code duplication.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D148717

Files:
  libc/src/string/CMakeLists.txt
  libc/src/string/memory_utils/bcmp_implementations.h
  libc/src/string/memory_utils/memcmp_implementations.h
  libc/src/string/memory_utils/op_generic.h
  libc/src/string/memory_utils/op_x86.h
  libc/src/string/memory_utils/utils.h
  libc/src/string/memory_utils/x86_64/memcmp_implementations.h
  libc/test/src/string/memory_utils/op_tests.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D148717.514965.patch
Type: text/x-patch
Size: 59232 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/libc-commits/attachments/20230419/6f932b7e/attachment-0001.bin>

