[libc-commits] [libc] [llvm] [libc][cmake][linux] require new LLVM_LIBC_USE_HOST_KERNEL_HEADERS or LIBC_KERNEL_HEADERS (PR #123820)

Wed Jan 29 10:58:16 PST 2025

================
@@ -216,6 +216,27 @@ if (LIBC_TARGET_OS_IS_WINDOWS AND LLVM_LIBC_FULL_BUILD)
   message(FATAL_ERROR "Windows does not support full mode build.")
 endif ()
 
+if (LIBC_TARGET_OS_IS_LINUX)
----------------
nickdesaulniers wrote:

It's worthwhile to consider this.  Let me jot down my thoughts.

Doing so would effectively change the implicit default value of `LIBC_USE_HOST_KERNEL_HEADERS` based on whether we are doing a full build (off) or an overlay build (on).
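
To make that concrete, a rough sketch of the build-mode-dependent default might look like the following. This is illustrative only and not code from the PR; it assumes the existing `LIBC_TARGET_OS_IS_LINUX` and `LLVM_LIBC_FULL_BUILD` variables and that the user has not set the option explicitly.

```cmake
# Hypothetical sketch (not part of this PR): pick a default for
# LIBC_USE_HOST_KERNEL_HEADERS from the build mode when the user
# has not set it explicitly.
if (LIBC_TARGET_OS_IS_LINUX AND NOT DEFINED LIBC_USE_HOST_KERNEL_HEADERS)
  if (LLVM_LIBC_FULL_BUILD)
    # Full build: do not silently fall back to /usr/include.
    set(LIBC_USE_HOST_KERNEL_HEADERS OFF)
  else ()
    # Overlay build: assume the host's kernel headers are wanted.
    set(LIBC_USE_HOST_KERNEL_HEADERS ON)
  endif ()
endif ()
```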

Two invariants I'd like to maintain:
1. In the spirit of #122376, I would like us to consider overlay builds the "more advanced" of the build configurations relative to full builds (due to the ABI compatibility concerns).  So if we require folks doing overlay builds to add one more cmake flag, in this case `-DLIBC_USE_HOST_KERNEL_HEADERS=ON`, then I don't feel too bad.
2. We encourage folks targeting linux to just build the kernel headers; it's pretty straightforward.  If they can't do this, then forcing them to admit to bad behavior via another cmake var is somewhat part of the intent of this PR.  Perhaps that's being malevolent.  Implying `LIBC_USE_HOST_KERNEL_HEADERS` for overlay builds defeats the point of making folks own up to looking at /usr/include. (Besides, it hurts our hermeticity; once we add /usr/include to the directory search path, we MAY end up including other non-kernel headers, since kernel and non-kernel headers are all mixed together in /usr/include. This alone gives me heartburn).
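
For reference, the explicit opt-in described by these two points could be sketched roughly as below. This is a simplified illustration rather than the actual hunk from the PR: the error and status message wording is made up, and it assumes `LIBC_KERNEL_HEADERS` holds a path to headers built from the kernel source.

```cmake
# Simplified sketch of the intent, not the actual PR text.
if (LIBC_TARGET_OS_IS_LINUX)
  if (NOT LIBC_USE_HOST_KERNEL_HEADERS AND NOT LIBC_KERNEL_HEADERS)
    # Neither option given: refuse to guess where kernel headers live.
    message(FATAL_ERROR
      "Targeting Linux requires either -DLIBC_KERNEL_HEADERS=<path> "
      "or -DLIBC_USE_HOST_KERNEL_HEADERS=ON.")
  endif ()
  if (LIBC_USE_HOST_KERNEL_HEADERS)
    # The user has explicitly owned up to searching /usr/include.
    message(STATUS "Using kernel headers from the host (/usr/include).")
  else ()
    message(STATUS "Using kernel headers from ${LIBC_KERNEL_HEADERS}.")
  endif ()
endif ()
```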

Thinking through the case of wanting to do an overlay build, is there ever a case where we'd want to use newer kernel headers (or perhaps older kernel headers) than what was available on the host?  A hypothetical use case for newer headers: you want to build on one host (that has old headers) but support newer syscalls. The result can only run on other machines with newer kernels (when newer syscalls are made at runtime).  A hypothetical use case for older headers: you want to build on one host (that has newer headers), but run the libc on another target that is running an older kernel.  Both of those hypotheticals seem to be a stretch in my mind. Perhaps overlay mode should (or does) imply "literally the result will only run on my host machine+kernel?"

---
An inspiration for `LIBC_USE_HOST_KERNEL_HEADERS` is `-enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang` which _was_ a flag in clang that was necessary to use `-ftrivial-auto-var-init=` for some time.  Basically, it was intended as a way to not outright block people from using `-ftrivial-auto-var-init=`, but make it such that there was one more hoop for them to jump through to use it.

In the same vein, I don't want to outright prevent people from looking at `/usr/include` for linux kernel headers (indeed, maybe on the buildbots we'll skip building kernel headers for each run), but I would like to gradually move us away from doing so.

I think adding `LIBC_USE_HOST_KERNEL_HEADERS` strikes the right balance: folks (and build bots) can transition to `LIBC_KERNEL_HEADERS` over time, rather than being forced to do so right now with no other option.
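
As an illustration of that transition, a build bot could keep its choice in an initial-cache file passed to `cmake -C`. The file itself and the headers path below are hypothetical, and the cache types are my assumption.

```cmake
# Hypothetical initial-cache file for a build bot, loaded with
#   cmake -C <this-file> ...
# Step 1 (now): explicitly opt in to the host's headers.
set(LIBC_USE_HOST_KERNEL_HEADERS ON CACHE BOOL
    "Temporary: use kernel headers from the host's /usr/include")

# Step 2 (later): drop the line above and point at headers built
# from the kernel source instead. The path is a placeholder.
# set(LIBC_KERNEL_HEADERS "/path/to/installed/kernel/headers" CACHE PATH
#     "Kernel headers built from source")
```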

---
So I guess, yes, it would make sense to automatically set it. But that would undercut an intent of this PR: to add one more hoop for folks who refuse to just build the kernel headers from source, so that they sign on the X that they are potentially including more than just kernel headers when they add /usr/include to the compiler's header search path.  Is that being too malicious?

https://github.com/llvm/llvm-project/pull/123820