[libc-commits] [PATCH] D74397: [libc] Adding memcpy implementation for x86_64

Mon Mar 16 11:27:49 PDT 2020

gchatelet marked 6 inline comments as done.
gchatelet added inline comments.

================
Comment at: libc/src/string/CMakeLists.txt:83
+add_subdirectory(${LIBC_MEMCPY_IMPL_FOLDER})
+add_memcpy(memcpy MARCH native)
----------------
sivachandra wrote:
> It seems like this will not ensure the best flags. Is that the intention? If so, why?
It will, `-march=native` enables all the features available on the host.
Why do you think `-march=native` won't get the best flags?

That said if you're on a `skylake-avx512` machine it will not use avx512 instructions. This is because `-mprefer-vector-width` is defaulted to 256 bit width operations (see this [phoronix article](https://www.phoronix.com/scan.php?page=news_item&px=LLVM-Clang-10-AVX512-Change))

Currently, if on a `skylake-avx512` machine the implementation will be the same as the `avx` one. We would have to measure to be sure it's worth forcing `-mprefer-vector-width=512` as well.

================
Comment at: libc/src/string/x86_64/CMakeLists.txt:1
+add_memcpy("memcpy_${LIBC_TARGET_MACHINE}_opt_none" REJECT "${ALL_CPU_FEATURES}")
+add_memcpy("memcpy_${LIBC_TARGET_MACHINE}_opt_sse" REQUIRE "SSE" REJECT "SSE2")
----------------
sivachandra wrote:
> These listings are already under `x86_64`. So, do we need to use `${LIBC_TARGET_MACHINE}`?
This is because we redirect both x86 and x86_64 versions to this folder and they ought to have different names. We should probably rename this folder if it's unclear. Maybe `x86_multi` ?

================
Comment at: libc/test/src/string/CMakeLists.txt:32
+  if(can_run)
+    compute_flags(flags MARCH native)
+    add_libc_unittest(
----------------
sivachandra wrote:
> sivachandra wrote:
> > gchatelet wrote:
> > > abrachet wrote:
> > > > Would you mind explaining this? It seems like ${flags} will just be -march=native, and the work above to find flags gets ignored.
> > > Actually there's no need for flags here, the implementation has already been compiled with the correct flags. The test itself doesn't need them.
> > Same question for me as well. I left a related comment at a different place above.
> Ah, so remove this line then?
This is a two step process:
 - the implementations get compiled with specific flags
 - when testing we retrieve these flags and check whether the current host supports them

If they are compatible, the already compiled `.o` file can run on the host, the test file itself doesn't need to receive the flags (only the implementation)

This will be the same for benchmarking, the benchmarking code does not need to be compiled with `avx` support, only the code under test needs to be.

Do you think it deserves a comment?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74397/new/

https://reviews.llvm.org/D74397