[libc-commits] [PATCH] D74397: [libc] Adding memcpy implementation for x86_64

Guillaume Chatelet via Phabricator via libc-commits libc-commits at lists.llvm.org
Wed Mar 18 09:47:31 PDT 2020

gchatelet added inline comments.

Comment at: libc/src/string/CMakeLists.txt:83
+add_memcpy(memcpy MARCH native)
sivachandra wrote:
> gchatelet wrote:
> > sivachandra wrote:
> > > It seems like this will not ensure the best flags. Is that the intention? If so, why?
> > It will, `-march=native` enables all the features available on the host.
> > Why do you think `-march=native` won't get the best flags?
> > 
> > That said if you're on a `skylake-avx512` machine it will not use avx512 instructions. This is because `-mprefer-vector-width` is defaulted to 256 bit width operations (see this [phoronix article](https://www.phoronix.com/scan.php?page=news_item&px=LLVM-Clang-10-AVX512-Change))
> > 
> > Currently, if on a `skylake-avx512` machine the implementation will be the same as the `avx` one. We would have to measure to be sure it's worth forcing `-mprefer-vector-width=512` as well.
> I was of the opinion that the compilers do not have the complete set of capabilities available to them and that is why we have facilities like  `HWCAP`, `cpuid` etc. But, you are the expert, and if you say what you have is enough, I will take it :)
With `-march=native` the compiler will introspect the CPU with [cpuid](https://github.com/llvm/llvm-project/blob/master/clang/lib/Headers/cpuid.h) and detect the available capabilities. We're deferring to the compiler here.

Now for shared libraries and runtime dispatch we'll have to provide such code but we're not there yet.

Comment at: libc/test/src/string/CMakeLists.txt:32
+  if(can_run)
+    compute_flags(flags MARCH native)
+    add_libc_unittest(
sivachandra wrote:
> gchatelet wrote:
> > sivachandra wrote:
> > > sivachandra wrote:
> > > > gchatelet wrote:
> > > > > abrachet wrote:
> > > > > > Would you mind explaining this? It seems like ${flags} will just be -march=native, and the work above to find flags gets ignored.
> > > > > Actually there's no need for flags here, the implementation has already been compiled with the correct flags. The test itself doesn't need them.
> > > > Same question for me as well. I left a related comment at a different place above.
> > > Ah, so remove this line then?
> > This is a two step process:
> >  - the implementations get compiled with specific flags
> >  - when testing we retrieve these flags and check whether the current host supports them
> > 
> > If they are compatible, the already compiled `.o` file can run on the host, the test file itself doesn't need to receive the flags (only the implementation)
> > 
> > This will be the same for benchmarking, the benchmarking code does not need to be compiled with `avx` support, only the code under test needs to be.
> > 
> > Do you think it deserves a comment?
> AFAICT, `compute_flags` does not have any side effects. Also, it doesn't look like `${flags}` is used anywhere here. So, why call it at all?
Yes thank you it was a leftover.

  rG LLVM Github Monorepo



More information about the libc-commits mailing list