[PATCH] D119296: KCFI sanitizer

Tue Jun 7 09:43:37 PDT 2022

samitolvanen added a comment.

In D119296#3562409 <https://reviews.llvm.org/D119296#3562409>, @MaskRay wrote:

> a) Naming
>
> About the 'k' prefix: this is generic and does not need to be coupled with "kernel",
> but perhaps an argument can be made that the 'k' does not need to refer to "kernel".

This scheme was specifically designed for the kernel, hence the name. Clang has other CFI schemes for user space, and while this might be useful for other low-level software, everyone else is probably better off using `-fsanitize=cfi`. I'm always happy to hear ideas for improving naming, of course!

> c) inliner
>
> Add this change to enabling inlining

Good catch, I'll fix that in the next version.

> d) x86-64 scheme
>
> Why is the following scheme picked? How are the double-int3 before and after movl useful?

This preamble format was specifically requested by x86 kernel maintainers to avoid special casing objtool and to accommodate runtime patching. I'll add a comment to explain this.

> e) `__kcfi_typeid_*`
>
> This deserves a better description in the summary.

Sure, I'll add better description.

> Is it useful to suppress the typeid symbol for `_Z` (if some C++ projects want to use this feature)? A mangled name has encoded the type information.

It's not, we still need a symbol that can be referenced directly in assembly code without having to compute the actual hash.

> IIUC the current scheme only works when two TUs have address-taken declarations of the same name but with different signatures,
> and done by a linker warning.

Yes, this will build, but an indirect call from the TU with an incorrect declaration will result in a runtime error.

> ELF linkers don't error for two `SHN_ABS` `STB_GLOBAL` symbols of the same `st_value`.
> When the `st_value` fields differ, there will be a diagnostic. If needed, the specialized diagnostic can be added there.

OK, so we could just not make the symbols weak and end up failing at link time if there's a mismatch. That sounds reasonable to me.

> But I'd prefer the lld patch to be separate and be properly tested (I add myself as a reviewer for lld/ELF changes to catch possibly unintended usage, sorry!)

Sure, I'll drop the lld change from this patch and we can improve the error message later. I'll address the remaining inline comments you had in the next version as well. Thanks for taking a look!

================
Comment at: llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp:1345
+    OutStreamer->emitLabel(Loc);
+    OutStreamer->emitAbsoluteSymbolDiff(Symbol, Loc, 4);
+
----------------
MaskRay wrote:
> Use getCodePointerSIze().
> 
> A 4-byte PC-relative relocation is insufficient if text and data are more than 31-bit away.
> Use getCodePointerSIze().
> 
> A 4-byte PC-relative relocation is insufficient if text and data are more than 31-bit away.

I did use that earlier, but the kernel maintainers specifically requested that I use a 32-bit offset instead to reduce memory overhead. The kernel uses this scheme also for static call addresses, so the 31-bit limit doesn't seem to be a concern in practise.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119296/new/

https://reviews.llvm.org/D119296