[libc-commits] [libc] [libc][docs] codify Policy on Assembler Sources (PR #88185)

Guillaume Chatelet via libc-commits libc-commits at lists.llvm.org
Tue Apr 16 02:19:37 PDT 2024


================
@@ -186,3 +186,32 @@ We expect contributions to be free of warnings from the `minimum supported
 compiler versions`__ (and newer).
 
 .. __: https://libc.llvm.org/compiler_support.html#minimum-supported-versions
+
+Policy on Assembler sources
+===========================
+
+Coding in high level languages such as C++ provides benefits relative to low
+level languages like Assembler, such as:
+
+* Improved safety
+* Instrumentation
+
+  * Code coverage
+  * Profile collection
+* Sanitization
+* Debug info
+
+While it's not impossible to have Assembler code that correctly provides all of
+the above, we do not wish to maintain such Assembler sources in llvm-libc.
+
+That said, there are a few functions provided by llvm-libc that are more difficult
+to implement or maintain in C++ than Assembler. We do use inline or out-of-line
+Assembler in an intentionally minimal set of places; typically places where the
+stack or individual register state must be manipulated very carefully for
+correctness.
+
+Contributions adding Assembler for performance are not welcome. Contributors
----------------
gchatelet wrote:

Most of the time the semantics are accessible through compiler builtins (e.g. [prefetch instructions](https://github.com/llvm/llvm-project/blob/d34a2c2adb2a4f1dc262c5756d3725caa4ea2571/libc/src/string/memory_utils/utils.h#L344-L350)), but there are situations where it's impossible to access a specific CPU instruction from C++ (e.g. [`rep movsb`](https://github.com/llvm/llvm-project/blob/d34a2c2adb2a4f1dc262c5756d3725caa4ea2571/libc/src/string/memory_utils/op_x86.h#L54), [`dc zva`](https://github.com/llvm/llvm-project/blob/d34a2c2adb2a4f1dc262c5756d3725caa4ea2571/libc/src/string/memory_utils/op_aarch64.h#L37-L43)).
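To make the distinction concrete, here is a minimal sketch of the two cases, not the actual llvm-libc code. It assumes a GCC- or Clang-compatible compiler; the helper names `prefetch_for_read` and `copy_bytes` are hypothetical, and the portable byte loop is only there so the sketch builds on non-x86 targets.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: the prefetch semantics are reachable from C++ through
// a builtin that both Clang and GCC provide, so no asm is needed.
inline void prefetch_for_read(const void *addr) {
  __builtin_prefetch(addr, /*rw=*/0, /*locality=*/3);
}

// Hypothetical helper: there is no builtin that guarantees `rep movsb`,
// so on x86-64 the instruction is requested through inline asm.
inline void copy_bytes(void *dst, const void *src, std::size_t count) {
  assert(count == 0 || (dst != nullptr && src != nullptr));
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
  // RDI = destination, RSI = source, RCX = byte count; all three are
  // modified by the instruction, hence the read-write "+" constraints.
  asm volatile("rep movsb"
               : "+D"(dst), "+S"(src), "+c"(count)
               :
               : "memory");
#else
  // Assumed fallback for other targets: a plain byte-by-byte copy.
  auto *d = static_cast<unsigned char *>(dst);
  const auto *s = static_cast<const unsigned char *>(src);
  for (std::size_t i = 0; i < count; ++i)
    d[i] = s[i];
#endif
}
```

The asm statement stays confined to a single small function, so the rest of the code remains ordinary C++ that sanitizers and coverage tools can still see.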

In principle, I agree with @dpxf that we should change the compiler and make sure the frontend gives us a way to reach these special semantics (e.g. [introduction of `__builtin_memcpy_inline`](https://clang.llvm.org/docs/LanguageExtensions.html#guaranteed-inlined-copy)), but in practice, having to support old versions of both Clang **and** GCC makes this much harder. For instance, `inline_memcpy` has a [special path for Clang and a fallback for GCC](https://github.com/llvm/llvm-project/blob/9141e1c24f87e5735bc4178a018eba4bdf2750aa/libc/src/string/memory_utils/utils.h#L82-L100). I won't dig into the details here, but this can lead to suboptimal codegen for GCC when generating `memcpy`. It also requires that the GCC version be compiled with `-fno-builtin-memcpy`, whereas the Clang version does not. So even if, in principle, we should improve the compiler, the effort to bring both Clang and GCC in line on these intrinsics is substantial, not to mention the compiler release cycles, fallback code and build system flags we would have to maintain to support older versions.
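The shape of such a two-path helper can be sketched as follows. This is an illustration in the spirit of the linked code, not a copy of it: the name `copy_fixed` is hypothetical, and the feature check via `__has_builtin` is an assumption about how one might select the path.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Older compilers may not provide __has_builtin; treat that as "no builtin".
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Hypothetical helper: copy exactly Size bytes, with Size known at compile
// time so the Clang builtin can be used where available.
template <std::size_t Size>
inline void copy_fixed(void *__restrict dst, const void *__restrict src) {
#if __has_builtin(__builtin_memcpy_inline)
  // Clang path: guaranteed to be expanded inline, never a call to memcpy.
  __builtin_memcpy_inline(dst, src, Size);
#else
  // Fallback path (e.g. GCC): a plain loop. When this is used to implement
  // memcpy itself, the build needs -fno-builtin-memcpy so the compiler does
  // not turn the loop back into a memcpy call.
  auto *d = static_cast<unsigned char *>(dst);
  const auto *s = static_cast<const unsigned char *>(src);
  for (std::size_t i = 0; i < Size; ++i)
    d[i] = s[i];
#endif
}
```

Both paths have to be kept, tested, and built with the right flags, which is the maintenance cost described above.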

So practically, for these admittedly rare cases, I think inline `asm` strikes a good balance between portability, maintenance and functionality. The layered approach from https://github.com/llvm/llvm-project/pull/88185#discussion_r1561280534 makes perfect sense to me. How about also requiring that PRs containing `asm` statements always be reviewed by at least two project maintainers?

https://github.com/llvm/llvm-project/pull/88185

