[llvm] 82af950 - [X86] Enable ibt-seal optimization when LTO is used in Kernel

Phoebe Wang via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 20 18:55:43 PST 2022


Author: Joao Moreira
Date: 2022-01-21T10:55:34+08:00
New Revision: 82af95029ec947fed8b9c516f04d4f217bd87930

URL: https://github.com/llvm/llvm-project/commit/82af95029ec947fed8b9c516f04d4f217bd87930
DIFF: https://github.com/llvm/llvm-project/commit/82af95029ec947fed8b9c516f04d4f217bd87930.diff

LOG: [X86] Enable ibt-seal optimization when LTO is used in Kernel

Intel's CET/IBT requires every indirect branch target to be an ENDBR instruction. Because of that, the compiler needs to correctly emit these instruction on function's prologues. Because this is a security feature, it is desirable that only actual indirect-branch-targeted functions are emitted with ENDBRs. While it is possible to identify address-taken functions through LTO, minimizing these ENDBR instructions remains a hard task for user-space binaries because exported functions may end being reachable through PLT entries, that will use an indirect branch for such. Because this cannot be determined during compilation-time, the compiler currently emits ENDBRs to every non-local-linkage function.

Despite the challenge presented for user-space, the kernel landscape is different as no PLTs are used. With the intent of providing the most fit ENDBR emission for the kernel, kernel developers proposed an optimization named "ibt-seal" which replaces the ENDBRs for NOPs directly in the binary. The discussion of this feature can be seen in [1].

This diff brings the enablement of the flag -mibt-seal, which in combination with LTO enforces a different policy for ENDBR placement in when the code-model is set to "kernel". In this scenario, the compiler will only emit ENDBRs to address taken functions, ignoring non-address taken functions that are don't have local linkage.

A comparison between an LTO-compiled kernel binaries without and with the -mibt-seal feature enabled shows that when -mibt-seal was used, the number of ENDBRs in the vmlinux.o binary patched by objtool decreased from 44383 to 33192, and that the number of superfluous ENDBR instructions nopped-out decreased from 11730 to 540.

The 540 missed superfluous ENDBRs need to be investigated further, but hypotheses are: assembly code not being taken care of by the compiler, kernel exported symbols mechanisms creating bogus address taken situations or even these being removed due to other binary optimizations like kernel's static_calls. For now, I assume that the large drop in the number of ENDBR instructions already justifies the feature being merged.

[1] - https://lkml.org/lkml/2021/11/22/591

Reviewed By: xiangzhangllvm

Differential Revision: https://reviews.llvm.org/D116070

Added: 
    llvm/test/CodeGen/X86/ibtseal-kernel.ll
    llvm/test/CodeGen/X86/ibtseal-large.ll
    llvm/test/CodeGen/X86/ibtseal-small.ll

Modified: 
    clang/include/clang/Basic/CodeGenOptions.def
    clang/include/clang/Driver/Options.td
    clang/lib/CodeGen/CodeGenModule.cpp
    clang/lib/Driver/ToolChains/Clang.cpp
    clang/lib/Frontend/CompilerInvocation.cpp
    llvm/lib/Target/X86/X86IndirectBranchTracking.cpp

Removed: 
    


################################################################################
diff  --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def
index 3526b8a4a9044..0da875525c0c4 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -107,6 +107,8 @@ CODEGENOPT(CFProtectionReturn , 1, 0) ///< if -fcf-protection is
                                       ///< set to full or return.
 CODEGENOPT(CFProtectionBranch , 1, 0) ///< if -fcf-protection is
                                       ///< set to full or branch.
+CODEGENOPT(IBTSeal, 1, 0)             ///< set to optimize CFProtectionBranch.
+
 CODEGENOPT(XRayInstrumentFunctions , 1, 0) ///< Set when -fxray-instrument is
                                            ///< enabled.
 CODEGENOPT(StackSizeSection  , 1, 0) ///< Set when -fstack-size-section is enabled.

diff  --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index b66363b1d3e92..49ceebcb51cf5 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1927,6 +1927,8 @@ def fcf_protection_EQ : Joined<["-"], "fcf-protection=">, Flags<[CoreOption, CC1
 def fcf_protection : Flag<["-"], "fcf-protection">, Group<f_Group>, Flags<[CoreOption, CC1Option]>,
   Alias<fcf_protection_EQ>, AliasArgs<["full"]>,
   HelpText<"Enable cf-protection in 'full' mode">;
+def mibt_seal : Flag<["-"], "mibt-seal">, Group<m_Group>, Flags<[CoreOption, CC1Option]>,
+  HelpText<"Optimize fcf-protection=branch/full (requires LTO).">;
 
 defm xray_instrument : BoolFOption<"xray-instrument",
   LangOpts<"XRayInstrument">, DefaultFalse,

diff  --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index e91da73d2f03c..d534cf182f5a7 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -712,6 +712,9 @@ void CodeGenModule::Release() {
                               1);
   }
 
+  if (CodeGenOpts.IBTSeal)
+    getModule().addModuleFlag(llvm::Module::Override, "ibt-seal", 1);
+
   // Add module metadata for return address signing (ignoring
   // non-leaf/all) and stack tagging. These are actually turned on by function
   // attributes, but we use module metadata to emit build attributes. This is

diff  --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index dd2570132ddf7..52d576345c027 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -6166,6 +6166,9 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
         Args.MakeArgString(Twine("-fcf-protection=") + A->getValue()));
   }
 
+  if (IsUsingLTO)
+    Args.AddLastArg(CmdArgs, options::OPT_mibt_seal);
+
   // Forward -f options with positive and negative forms; we translate these by
   // hand.  Do not propagate PGO options to the GPU-side compilations as the
   // profile info is for the host-side compilation only.

diff  --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp
index 7727d70adfb1e..eaca1fbb9def8 100644
--- a/clang/lib/Frontend/CompilerInvocation.cpp
+++ b/clang/lib/Frontend/CompilerInvocation.cpp
@@ -1814,6 +1814,9 @@ bool CompilerInvocation::ParseCodeGenArgs(CodeGenOptions &Opts, ArgList &Args,
       Diags.Report(diag::err_drv_invalid_value) << A->getAsString(Args) << Name;
   }
 
+  if (Opts.PrepareForLTO && Args.hasArg(OPT_mibt_seal))
+    Opts.IBTSeal = 1;
+
   for (auto *A :
        Args.filtered(OPT_mlink_bitcode_file, OPT_mlink_builtin_bitcode)) {
     CodeGenOptions::BitcodeFileToLink F;

diff  --git a/llvm/lib/Target/X86/X86IndirectBranchTracking.cpp b/llvm/lib/Target/X86/X86IndirectBranchTracking.cpp
index 6642f46e64b2f..7e751a4c8811e 100644
--- a/llvm/lib/Target/X86/X86IndirectBranchTracking.cpp
+++ b/llvm/lib/Target/X86/X86IndirectBranchTracking.cpp
@@ -95,14 +95,45 @@ static bool IsCallReturnTwice(llvm::MachineOperand &MOp) {
   return Attrs.hasFnAttr(Attribute::ReturnsTwice);
 }
 
+// Checks if function should have an ENDBR in its prologue
+static bool needsPrologueENDBR(MachineFunction &MF, const Module *M) {
+  Function &F = MF.getFunction();
+
+  if (F.doesNoCfCheck())
+    return false;
+
+  const X86TargetMachine *TM =
+      static_cast<const X86TargetMachine *>(&MF.getTarget());
+  Metadata *IBTSeal = M->getModuleFlag("ibt-seal");
+
+  switch (TM->getCodeModel()) {
+  // Large code model functions always reachable through indirect calls.
+  case CodeModel::Large:
+    return true;
+  // Only address taken functions in LTO'ed kernel are reachable indirectly.
+  // IBTSeal implies LTO, thus only check if function is address taken.
+  case CodeModel::Kernel:
+    // Check if ibt-seal was enabled (implies LTO is being used).
+    if (IBTSeal) {
+      return F.hasAddressTaken();
+    }
+    // if !IBTSeal, fall into default case.
+    LLVM_FALLTHROUGH;
+  // Address taken or externally linked functions may be reachable.
+  default:
+    return (F.hasAddressTaken() || !F.hasLocalLinkage());
+  }
+}
+
 bool X86IndirectBranchTrackingPass::runOnMachineFunction(MachineFunction &MF) {
   const X86Subtarget &SubTarget = MF.getSubtarget<X86Subtarget>();
 
+  const Module *M = MF.getMMI().getModule();
   // Check that the cf-protection-branch is enabled.
-  Metadata *isCFProtectionSupported =
-      MF.getMMI().getModule()->getModuleFlag("cf-protection-branch");
-  // NB: We need to enable IBT in jitted code if JIT compiler is CET
-  // enabled.
+  Metadata *isCFProtectionSupported = M->getModuleFlag("cf-protection-branch");
+
+  //  NB: We need to enable IBT in jitted code if JIT compiler is CET
+  //  enabled.
   const X86TargetMachine *TM =
       static_cast<const X86TargetMachine *>(&MF.getTarget());
 #ifdef __CET__
@@ -119,13 +150,8 @@ bool X86IndirectBranchTrackingPass::runOnMachineFunction(MachineFunction &MF) {
   TII = SubTarget.getInstrInfo();
   EndbrOpcode = SubTarget.is64Bit() ? X86::ENDBR64 : X86::ENDBR32;
 
-  // Large code model, non-internal function or function whose address
-  // was taken, can be accessed through indirect calls. Mark the first
-  // BB with ENDBR instruction unless nocf_check attribute is used.
-  if ((TM->getCodeModel() == CodeModel::Large ||
-       MF.getFunction().hasAddressTaken() ||
-       !MF.getFunction().hasLocalLinkage()) &&
-      !MF.getFunction().doesNoCfCheck()) {
+  // If function is reachable indirectly, mark the first BB with ENDBR.
+  if (needsPrologueENDBR(MF, M)) {
     auto MBB = MF.begin();
     Changed |= addENDBR(*MBB, MBB->begin());
   }

diff  --git a/llvm/test/CodeGen/X86/ibtseal-kernel.ll b/llvm/test/CodeGen/X86/ibtseal-kernel.ll
new file mode 100644
index 0000000000000..eb98515f01f81
--- /dev/null
+++ b/llvm/test/CodeGen/X86/ibtseal-kernel.ll
@@ -0,0 +1,19 @@
+; RUN: llc < %s -O2 -mtriple=x86_64-unknown-linux-gnu -x86-indirect-branch-tracking --code-model=kernel | FileCheck %s --check-prefix=CHECK-KERNEL-IBTSEAL
+
+; CHECK-KERNEL-IBTSEAL: foo:
+; CHECK-KERNEL-IBTSEAL: endbr
+; CHECK-KERNEL-IBTSEAL: bar:
+; CHECK-KERNEL-IBTSEAL-NOT: endbr
+
+target triple = "x86_64-unknown-linux-gnu"
+
+define dso_local void @foo() {
+  ret void
+}
+
+define dso_local i8* @bar() {
+  ret i8* bitcast (void ()* @foo to i8*)
+}
+
+!llvm.module.flags = !{!1}
+!1 = !{i32 4, !"ibt-seal", i32 1}

diff  --git a/llvm/test/CodeGen/X86/ibtseal-large.ll b/llvm/test/CodeGen/X86/ibtseal-large.ll
new file mode 100644
index 0000000000000..e48ac8eb19a7b
--- /dev/null
+++ b/llvm/test/CodeGen/X86/ibtseal-large.ll
@@ -0,0 +1,19 @@
+; RUN: llc < %s -O2 -mtriple=x86_64-unknown-linux-gnu -x86-indirect-branch-tracking --code-model=large | FileCheck %s --check-prefix=CHECK-LARGE-IBTSEAL
+
+; CHECK-LARGE-IBTSEAL: foo:
+; CHECK-LARGE-IBTSEAL: endbr
+; CHECK-LARGE-IBTSEAL: bar:
+; CHECK-LARGE-IBTSEAL: endbr
+
+target triple = "x86_64-unknown-linux-gnu"
+
+define dso_local void @foo() {
+  ret void
+}
+
+define dso_local i8* @bar() {
+  ret i8* bitcast (void ()* @foo to i8*)
+}
+
+!llvm.module.flags = !{!1}
+!1 = !{i32 4, !"ibt-seal", i32 1}

diff  --git a/llvm/test/CodeGen/X86/ibtseal-small.ll b/llvm/test/CodeGen/X86/ibtseal-small.ll
new file mode 100644
index 0000000000000..3d810d89a26c6
--- /dev/null
+++ b/llvm/test/CodeGen/X86/ibtseal-small.ll
@@ -0,0 +1,19 @@
+; RUN: llc < %s -O2 -mtriple=x86_64-unknown-linux-gnu -x86-indirect-branch-tracking --code-model=small | FileCheck %s --check-prefix=CHECK-SMALL-IBTSEAL
+
+; CHECK-SMALL-IBTSEAL: foo:
+; CHECK-SMALL-IBTSEAL: endbr
+; CHECK-SMALL-IBTSEAL: bar:
+; CHECK-SMALL-IBTSEAL: endbr
+
+target triple = "x86_64-unknown-linux-gnu"
+
+define dso_local void @foo() {
+  ret void
+}
+
+define dso_local i8* @bar() {
+  ret i8* bitcast (void ()* @foo to i8*)
+}
+
+!llvm.module.flags = !{!1}
+!1 = !{i32 4, !"ibt-seal", i32 1}


        


More information about the llvm-commits mailing list