[libc-commits] [PATCH] D74397: [libc] Adding memcpy implementation for x86_64

Guillaume Chatelet via Phabricator via libc-commits libc-commits at lists.llvm.org
Wed Feb 19 05:34:11 PST 2020


gchatelet marked 11 inline comments as done.
gchatelet added inline comments.


================
Comment at: libc/src/string/CMakeLists.txt:10
+  COMMENT "Check that the generated code is valid"
+  COMMAND ! (nm --undefined-only $<TARGET_PROPERTY:${target_name},OBJECT_FILE_RAW> | grep U && echo "Implementation contains undefined symbols") 
+)
----------------
abrachet wrote:
> [[ https://github.com/llvm/llvm-project/blob/master/llvm/tools/llvm-nm/llvm-nm.cpp#L811 | llvm-nm ]] (the nm that ships with MacOS) for mach-o files prints just the symbol and not its value or type character so grepping for U wouldn't work here (in this very niche case that I just happened to test). If we're using --undefined-only we could just `grep .` perhaps?
thx for letting me know. I guess `grep .` will work.


================
Comment at: libc/src/string/CMakeLists.txt:78
+  compile_memcpy_with_flags(avx512 FLAGS -mavx512f)
+endif()
+
----------------
sivachandra wrote:
> abrachet wrote:
> > sivachandra wrote:
> > > gchatelet wrote:
> > > > This now creates the following memcpy implementations
> > > >  - `__llvm_libc::memcpy_x86_64_avx512`
> > > >  - `__llvm_libc::memcpy_x86_64_avx`
> > > >  - `__llvm_libc::memcpy_x86_64_sse2`
> > > >  - `__llvm_libc::memcpy_x86_64_sse`
> > > >  - `__llvm_libc::memcpy_x86_64_unopt`
> > > > 
> > > > For shared libc, we need an ifunc like trampoline to select the correct version.
> > > > For static libc, we need to select an implementation
> > > > 
> > > > @sivachandra how do you see this kind if code generation integrate into the more general cmake functions from `libc/cmake/modules/LLVMLibCRules.cmake`?
> > > > I expect other memory functions to follow the same scheme.
> > > Instead of building all the possible implementations, could we use the CMake [[ https://cmake.org/cmake/help/v3.14/command/try_compile.html | `try_compile` ]] and/or [[ https://cmake.org/cmake/help/v3.14/command/try_run.html | `try_run` ]] command to sniff out the best flags to use? I think `try_run` is more appropriate as I expect that we need to run the `cpuid` instruction? Also, compilers have a convenience macro `__cpuid` to run this instruction on x86/x86_64?
> > > 
> > > BTW, one can have ifuncs in static libraries as well. But, I do understand we want to avoid the overhead of the added indirection, so sniffing out at configure time is the best. If we can setup something for configure time sniffing, I believe we should be able use it (may be with straightforward extension/modification iff required) to use as the ifunc selector as well.
> > I think this is going to get quickly outside of the scope of this patch. It would probably be better to work on this in a separate patch.
> If there are any infrastructure pieces required to build a full solution, then we can do them separately as a prerequisite. However, if correct compile options are critical to memcpy implementation, then any code added to deduce them should belong to this patch. I agree that the patch will start to become too big. But, I think the additions to memcpy_utils and their tests can be split out to another prerequisite patch.
@sivachandra `try_compile` and `try_run` will only work when the host is also the target of the build. For instance it won't work when cross compiling.

There are a few dimensions here:
1. static library
    - autodetect which implementation to use with `try_compile` / `try_run`
    - manually choose the implementation for cross compilation
2. dynamic library
    - select a set of implementations to include in the library and the associated dispatch code
3. test
    - test all the implementations that can run on the host

I agree with both of you that this is getting too big for this patch.
Let me do my homework and come back with the prerequisite patches.


================
Comment at: libc/src/string/CMakeLists.txt:78
+  compile_memcpy_with_flags(avx512 FLAGS -mavx512f)
+endif()
+
----------------
gchatelet wrote:
> sivachandra wrote:
> > abrachet wrote:
> > > sivachandra wrote:
> > > > gchatelet wrote:
> > > > > This now creates the following memcpy implementations
> > > > >  - `__llvm_libc::memcpy_x86_64_avx512`
> > > > >  - `__llvm_libc::memcpy_x86_64_avx`
> > > > >  - `__llvm_libc::memcpy_x86_64_sse2`
> > > > >  - `__llvm_libc::memcpy_x86_64_sse`
> > > > >  - `__llvm_libc::memcpy_x86_64_unopt`
> > > > > 
> > > > > For shared libc, we need an ifunc like trampoline to select the correct version.
> > > > > For static libc, we need to select an implementation
> > > > > 
> > > > > @sivachandra how do you see this kind if code generation integrate into the more general cmake functions from `libc/cmake/modules/LLVMLibCRules.cmake`?
> > > > > I expect other memory functions to follow the same scheme.
> > > > Instead of building all the possible implementations, could we use the CMake [[ https://cmake.org/cmake/help/v3.14/command/try_compile.html | `try_compile` ]] and/or [[ https://cmake.org/cmake/help/v3.14/command/try_run.html | `try_run` ]] command to sniff out the best flags to use? I think `try_run` is more appropriate as I expect that we need to run the `cpuid` instruction? Also, compilers have a convenience macro `__cpuid` to run this instruction on x86/x86_64?
> > > > 
> > > > BTW, one can have ifuncs in static libraries as well. But, I do understand we want to avoid the overhead of the added indirection, so sniffing out at configure time is the best. If we can setup something for configure time sniffing, I believe we should be able use it (may be with straightforward extension/modification iff required) to use as the ifunc selector as well.
> > > I think this is going to get quickly outside of the scope of this patch. It would probably be better to work on this in a separate patch.
> > If there are any infrastructure pieces required to build a full solution, then we can do them separately as a prerequisite. However, if correct compile options are critical to memcpy implementation, then any code added to deduce them should belong to this patch. I agree that the patch will start to become too big. But, I think the additions to memcpy_utils and their tests can be split out to another prerequisite patch.
> @sivachandra `try_compile` and `try_run` will only work when the host is also the target of the build. For instance it won't work when cross compiling.
> 
> There are a few dimensions here:
> 1. static library
>     - autodetect which implementation to use with `try_compile` / `try_run`
>     - manually choose the implementation for cross compilation
> 2. dynamic library
>     - select a set of implementations to include in the library and the associated dispatch code
> 3. test
>     - test all the implementations that can run on the host
> 
> I agree with both of you that this is getting too big for this patch.
> Let me do my homework and come back with the prerequisite patches.
Let me think about it, I'll get back with the required building blocks.


================
Comment at: libc/src/string/memcpy_arch_specific.h.def:33-34
+// - As compilers and processors get better, the generated code is improved
+// with
+//   little change on the code side.
+static void memcpy_no_return(char *__restrict dst, const char *__restrict src,
----------------
abrachet wrote:
> Seems like clang-format interfered a little bit here.
Good catch


================
Comment at: libc/src/string/memcpy_entrypoint.cpp:1
+//===--------------------- Implementation of memcpy -----------------------===//
+//
----------------
abrachet wrote:
> Must this be in its own file? Why not in src/string/memcpy.cpp?
On the one hand we need to pick one implementation to be the `LLVM_LIBC_ENTRYPOINT(memcpy)` and on the other hand we want to generate multiple implementations to be able to test them and embed them into the library.

I don't like having two files for this purpose, this is a dirty hack that is not to be submitted, I'll come back with a better solution.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74397/new/

https://reviews.llvm.org/D74397





More information about the libc-commits mailing list