[PATCH] D119296: KCFI sanitizer

Nick Desaulniers via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Apr 20 11:10:19 PDT 2022

nickdesaulniers added inline comments.

Comment at: clang/lib/CodeGen/CodeGenModule.cpp:553-554
+  if (LangOpts.Sanitize.has(SanitizerKind::KCFI)) {
+    EmitKCFIConstants();
+    ClearUnusedKCFIPrefixes();
+  }
This is the only call site for these two methods, and they both loop over every function in the module. Can/should we merge the methods in order to fuse the loops?

Comment at: clang/lib/CodeGen/CodeGenModule.cpp:2248
+  F->setPrefixData(CreateKCFITypeId(FD->getType()));
+  F->addFnAttr("kcfi");
While string based attributes are easier to work with in LLVM, I wonder if this should be made into an actual keyword.  This involves some boilerplate additions to:
- llvm/include/llvm/AsmParser/LLToken.h
- llvm/include/llvm/Bitcode/LLVMBitCodes.h
- llvm/include/llvm/IR/Attributes.td
- llvm/lib/AsmParser/LLLexer.cpp
- llvm/lib/Bitcode/Reader/BitcodeReader.cpp
- llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
- llvm/lib/Transforms/Utils/CodeExtractor.cpp
- llvm/test/Bitcode/attributes.ll
- llvm/include/llvm/IR/Attributes.td

I haven't seen guidance as to which is preferred or what the tradeoffs are. These string attrs are way less boilerplate, but not seeing support in CodeExtractor makes me slightly dubious. Let me ask on LLVM's discourse (or open question to other reviewers).

Also, this reminds me; isn't there a fn attr we use today in the kernel to blanket say "don't instrument this?" I wonder if that needs to be updated to know about disabling CFI, if I'm remembering correctly and that is necessary. I'm thinking of `noinstr` in the Linux kernel, which looks like it sets a bunch of things.

Comment at: clang/lib/CodeGen/CodeGenModule.cpp:2255-2260
+  for (const char &C : Name) {
+    if (llvm::isAlnum(C) || C == '_' || C == '.')
+      continue;
+    return false;
+  }
+  return true;
Is this more concise using `llvm::all_of` from llvm/ADT/STLExtras.h?

Comment at: clang/lib/CodeGen/CodeGenModule.cpp:2275-2276
+    std::string Name = F.getName().str();
+    if (!allowKCFIIdentifier(Name))
+      continue;
Maybe `allowKCFIIdentifier` should accept a `StringRef` rather than `std::string`? or would that break the for each char loop?

Comment at: clang/lib/CodeGen/CodeGenModule.cpp:2280-2281
+    Name = "__kcfi_typeid_" + Name;
+    std::string Asm = ".weak " + Name + "\n" + ".set " + Name + ", " +
+                      std::to_string(Id->getZExtValue()) + "\n";
+    M.appendModuleInlineAsm(Asm);
Prefer llvm::Twine for building up strings via concatenation efficiently.

Comment at: clang/test/CodeGen/kcfi.c:43
+         call(f3) +
+         call(f4);
samitolvanen wrote:
> nickdesaulniers wrote:
> > How about a direct call to `f5`? That should have no hash, right? Perhaps test that `test` has no  hash as well?
> `test` is a global function, which needs to have a hash as it could be indirectly called from another module.
> Note that we don't need hashes in non-address-taken local functions, but the initial patch emitted them anyway. I changed the code to drop those hashes and added a directly called `static f5(void)` to the test.
Oh, that's nice!

Comment at: llvm/include/llvm/MC/MCObjectFileInfo.h:359
+  MCSection *getKCFISection(const MCSection &TextSec,
+                            const std::string &Name) const;
nickdesaulniers wrote:
> can we take a StringRef instead (and include the right header)?
and include the right headers for both parameters?

  rG LLVM Github Monorepo



More information about the cfe-commits mailing list