[all-commits] [llvm/llvm-project] 9ec6eb: [libc] Improve memcmp latency and codegen

Guillaume Chatelet via All-commits all-commits at lists.llvm.org
Mon Jun 5 02:46:23 PDT 2023


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 9ec6ebd3ceabb29482aa18a64b943788b65223dc
      https://github.com/llvm/llvm-project/commit/9ec6ebd3ceabb29482aa18a64b943788b65223dc
  Author: Guillaume Chatelet <gchatelet at google.com>
  Date:   2023-06-05 (Mon, 05 Jun 2023)

  Changed paths:
    M libc/src/string/CMakeLists.txt
    M libc/src/string/memory_utils/CMakeLists.txt
    M libc/src/string/memory_utils/bcmp_implementations.h
    M libc/src/string/memory_utils/memcmp_implementations.h
    M libc/src/string/memory_utils/op_generic.h
    M libc/src/string/memory_utils/op_x86.h
    M libc/src/string/memory_utils/utils.h
    M libc/src/string/memory_utils/x86_64/memcmp_implementations.h
    M libc/test/src/string/memory_utils/op_tests.cpp
    M utils/bazel/llvm-project-overlay/libc/BUILD.bazel

  Log Message:
  -----------
  [libc] Improve memcmp latency and codegen

This is based on ideas from @nafi to:
 - use a branchless version of 'cmp' for 'uint32_t',
 - completely resolve the lexicographic comparison through vector
   operations when wide types are available. We also get rid of byte
   reloads and serializing '__builtin_ctzll'.
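
The first idea above can be illustrated with a minimal sketch. It is not the
code from this commit: the names 'load_be32', 'cmp_u32' and 'cmp4' are
hypothetical, and the '(a > b) - (a < b)' form is just one common branchless
formulation that may differ from what the patch uses. The sketch assumes only
that a 4-byte block is loaded with its first byte as the most significant, so
that integer order matches memcmp's lexicographic order.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: read 4 bytes so that the byte at the lowest address
// becomes the most significant byte. Integer order then equals byte order.
inline uint32_t load_be32(const void *p) {
  unsigned char b[4];
  std::memcpy(b, p, sizeof(b));
  return (uint32_t{b[0]} << 24) | (uint32_t{b[1]} << 16) |
         (uint32_t{b[2]} << 8) | uint32_t{b[3]};
}

// Three-way compare written without an explicit branch in the source:
// yields -1, 0 or 1. Compilers typically lower this to flag-setting
// instructions rather than conditional jumps.
inline int cmp_u32(uint32_t a, uint32_t b) {
  return static_cast<int>(a > b) - static_cast<int>(a < b);
}

// memcmp restricted to exactly 4 bytes, built from the two helpers above.
inline int cmp4(const void *p, const void *q) {
  return cmp_u32(load_be32(p), load_be32(q));
}

// Small usage example: the sign of the result agrees with std::memcmp.
inline void example() {
  assert(cmp4("abcd", "abce") < 0);
  assert(cmp4("abcd", "abcd") == 0);
  assert((cmp4("abcd", "abcc") > 0) == (std::memcmp("abcd", "abcc", 4) > 0));
}
```

Comparing whole words this way avoids locating the first differing byte and
reloading it, which is the kind of byte reload the message refers to.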

I did not include the suggestion to replace comparisons of 'uint16_t'
with two 'uint8_t' as it did not seem to help the codegen. This can
be revisited in subsequent patches.

The code has been rewritten to reduce nested function calls, making the
job of the inliner easier and preventing harmful code duplication.

Reviewed By: nafi3000

Differential Revision: https://reviews.llvm.org/D148717



