[PATCH] D61524: [BPF] Support for compile once and run everywhere

Yonghong Song via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri May 3 10:59:32 PDT 2019


yonghong-song created this revision.
yonghong-song added reviewers: ast, JiongWang.
Herald added subscribers: llvm-commits, mgorny.
Herald added a project: LLVM.

Introduction
============

This patch adds initial support for bpf program compile once
and run everywhere (CO-RE).

The main motivation is bpf programs that depend on kernel
headers, which may vary between kernel versions.
The initial discussion can be found at https://lwn.net/Articles/773198/.

Currently, a bpf program accesses kernel internal data structures
through the bpf_probe_read() helper. The idea is to capture the
kernel data structure accesses made through bpf_probe_read()
and relocate them on different kernel versions.

On each host, right before the bpf program is loaded, the bpf loader
looks at the types of the running kernel through vmlinux BTF,
calculates the proper access offset and patches the instruction.

To accommodate this, the patch does the following:

  . An IR pass is added to convert getelementptr to a
    global variable whose name encodes the getelementptr
    access pattern.
  . A SimplifyPatchable MachineInstruction pass is added
    to remove unnecessary loads.
  . The BTF output pass is enhanced to generate relocation
    records located in the .BTF.ext section.

Typical CO-RE also needs support for global variables which can
be assigned different values on different hosts. For example,
the kernel version can be used to guard different versions of code.
This patch adds support for such patchable externals as well.

Example
=======

The following is an example.

  struct pt_regs {
    long arg1;
    long arg2;
  };
  struct sk_buff {
    int i;
    struct net_device *dev;
  };
  
  static int (*bpf_probe_read)(void *dst, int size, void *unsafe_ptr) =
          (void *) 4;
  extern __attribute__((section(".BPF.patchable_externs"))) unsigned __kernel_version;
  int bpf_prog(struct pt_regs *ctx) {
    struct net_device *dev = 0;
  
    // ctx->arg* does not need bpf_probe_read
    if (__kernel_version >= 41608)
      bpf_probe_read(&dev, sizeof(dev), &((struct sk_buff *)ctx->arg1)->dev);
    else
      bpf_probe_read(&dev, sizeof(dev), &((struct sk_buff *)ctx->arg2)->dev);
    return dev != 0;
  }

In the above, we want to turn the third argument of
bpf_probe_read() into a relocation.

  -bash-4.4$ clang -target bpf -O2 -g -S -emit-llvm trace.c
  -bash-4.4$ llc -march=bpf -filetype=asm -mattr=offsetreloc trace.ll

The compiler will generate two new subsections in .BTF.ext,
OffsetReloc and ExternReloc.
OffsetReloc records the structure member offset operations,
and ExternReloc records the external globals, for which
only u8, u16, u32 and u64 are supported.

   BPFOffsetReloc Size
   struct SecOffsetReloc for ELF section #1
   A number of struct BPFOffsetReloc for ELF section #1
   struct SecOffsetReloc for ELF section #2
   A number of struct BPFOffsetReloc for ELF section #2
   ...
   BPFExternReloc Size
   struct SecExternReloc for ELF section #1
   A number of struct BPFExternReloc for ELF section #1
   struct SecExternReloc for ELF section #2
  
  struct BPFOffsetReloc {
    uint32_t InsnOffset;    ///< Byte offset in this section
    uint32_t TypeID;        ///< TypeID for the relocation
    uint32_t OffsetNameOff; ///< The string to traverse types
    uint32_t Dependency;    ///< Depending on another BPFOffsetReloc
  };
  
  struct BPFExternReloc {
    uint32_t InsnOffset;    ///< Byte offset in this section
    uint32_t ExternNameOff; ///< The string for external variable
  };
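
As a rough illustration of how a loader might consume one offset record,
the sketch below assumes the section's instructions are already in memory
as an array of 8-byte bpf instructions and that the host offset has been
computed elsewhere. `struct bpf_insn_sketch` mirrors the standard eBPF
instruction encoding; `apply_offset_reloc` is a hypothetical helper and
is not part of this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard 8-byte eBPF instruction layout (little-endian host assumed). */
struct bpf_insn_sketch {
  uint8_t code; /* opcode */
  uint8_t regs; /* dst register in the low 4 bits, src in the high 4 bits */
  int16_t off;  /* signed offset */
  int32_t imm;  /* signed immediate; the field a relocation rewrites */
};
static_assert(sizeof(struct bpf_insn_sketch) == 8, "bpf insn must be 8 bytes");

/* Record layout as given in the patch description. */
struct BPFOffsetReloc {
  uint32_t InsnOffset;    /* byte offset in this section */
  uint32_t TypeID;        /* type id for the relocation */
  uint32_t OffsetNameOff; /* string table offset of the access string */
  uint32_t Dependency;    /* depending on another BPFOffsetReloc */
};

/* Hypothetical helper: write host_offset into the immediate of the
 * instruction named by rec. Returns 0 on success, -1 if InsnOffset is
 * misaligned or lies outside the instruction array. */
static int apply_offset_reloc(struct bpf_insn_sketch *insns, size_t insn_cnt,
                              const struct BPFOffsetReloc *rec,
                              int32_t host_offset) {
  if (rec->InsnOffset % sizeof(*insns) != 0)
    return -1;
  size_t idx = rec->InsnOffset / sizeof(*insns);
  if (idx >= insn_cnt)
    return -1;
  insns[idx].imm = host_offset;
  return 0;
}
```

The sketch ignores TypeID, OffsetNameOff and Dependency; a real loader
would use them to derive host_offset from the vmlinux BTF.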

Note that only externs with the section attribute ".BPF.patchable_externs"
are considered for ExternReloc; these will be patched by the bpf loader
right before the load.

For the above test case, two offset records and one extern record
will be generated:

  OffsetReloc records:
        .long   .Ltmp12                 # Insn Offset
        .long   7                       # TypeId
        .long   239                     # Type Decode String
        .long   0                       # Dependency
        .long   .Ltmp18                 # Insn Offset
        .long   7                       # TypeId
        .long   239                     # Type Decode String
        .long   0                       # Dependency
  
  ExternReloc record:
        .long   .Ltmp5                  # Insn Offset
        .long   164                     # External Variable
  
  In string table:
        .ascii  "0:1"                   # string offset=239
        .ascii  "__kernel_version"      # string offset=164

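The access string is a colon-separated list of indices. The first index
counts whole elements of the base type and the next index selects a
structure member, with 0 representing the 1st member. The offset here
can therefore be calculated as:

  size of 0 types + the offset of the 2nd member of the structure.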
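
The calculation can be sketched in C as follows. This is only an
illustration under stated assumptions: it uses a host-compiled
`struct sk_buff` from the example as a stand-in for the types a loader
would read from vmlinux BTF, it handles only two-level strings of the
form "<element>:<member>", and it assumes the 2nd member is written as
index 1. `access_offset` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct net_device;
struct sk_buff {
  int i;
  struct net_device *dev;
};

/* Byte offsets of the members of struct sk_buff, indexed by member number. */
static const size_t sk_buff_member_off[] = {
  offsetof(struct sk_buff, i),
  offsetof(struct sk_buff, dev),
};

/* Hypothetical helper: compute the byte offset described by an access
 * string "<element>:<member>". Returns -1 for malformed strings or an
 * out-of-range member index. */
static long access_offset(const char *s) {
  char *end;
  long elem = strtol(s, &end, 10);
  if (end == s || *end != ':' || elem < 0)
    return -1;
  const char *m = end + 1;
  long member = strtol(m, &end, 10);
  if (end == m || *end != '\0' || member < 0 || member >= 2)
    return -1;
  /* size of `elem` whole types + offset of the selected member */
  return elem * (long)sizeof(struct sk_buff) +
         (long)sk_buff_member_off[member];
}
```

With this layout, access_offset("0:1") yields the offset of `dev`, which
is the value the loader would write into the relocated instruction.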

The asm code:

  .Ltmp5: 
  .Ltmp6: 
          r2 = 0
          r3 = 41608
  .Ltmp7: 
  .Ltmp8:
          .loc    1 22 7 is_stmt 0        # trace.c:22:7
  .Ltmp9: 
          if r3 > r2 goto LBB0_2
  .Ltmp10:
  .Ltmp11:
          .loc    1 0 7                   # trace.c:0:7
  .Ltmp12:
          r2 = 8
  .Ltmp13:
          .loc    1 23 64 is_stmt 1       # trace.c:23:64
  .Ltmp14:
  .Ltmp15:
          r3 = *(u64 *)(r1 + 0)
          goto LBB0_3
  .Ltmp16:
  .Ltmp17:
  LBB0_2:
          .loc    1 0 64 is_stmt 0        # trace.c:0:64
  .Ltmp18:
          r2 = 8
          .loc    1 25 64 is_stmt 1       # trace.c:25:64
  .Ltmp19:
          r3 = *(u64 *)(r1 + 8)
  .Ltmp20:
  .Ltmp21:
  LBB0_3:
          .loc    1 0 64 is_stmt 0        # trace.c:0:64
          r3 += r2
          r1 = r10
  .Ltmp22:
  .Ltmp23:
  .Ltmp24:
          r1 += -8
          r2 = 8
          call 4

For the instructions at .Ltmp12 and .Ltmp18, "r2 = 8", the number
8 is the structure offset based on the current BTF.
The loader needs to adjust it if it changes on the host.

For the instruction at .Ltmp5, "r2 = 0", the external variable
gets a default value of 0. The loader needs to supply an appropriate
value for the particular host.
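
A minimal sketch of the extern side, assuming the loader keeps a table of
per-host values keyed by the extern's name from the string table. The
table type and `patch_extern_imm` are hypothetical names used only for
illustration, and the sketch covers only values that fit the 32-bit
immediate of a single instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-host table supplied by the loader. */
struct extern_value {
  const char *name;
  uint32_t value;
};

/* Look up `name` and, if found, overwrite the instruction immediate.
 * Returns 0 on success, -1 if the host has no value for this extern,
 * in which case the compiled-in default is left untouched. */
static int patch_extern_imm(int32_t *imm, const char *name,
                            const struct extern_value *tbl, size_t n) {
  for (size_t i = 0; i < n; i++) {
    if (strcmp(tbl[i].name, name) == 0) {
      *imm = (int32_t)tbl[i].value;
      return 0;
    }
  }
  return -1;
}
```

For the test case above, the loader would call this with the name
"__kernel_version" and the immediate of the instruction at .Ltmp5.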

Compiling to object code and disassembling gives:

  0000000000000000 bpf_prog:
          0:       b7 02 00 00 00 00 00 00         r2 = 0
          1:       7b 2a f8 ff 00 00 00 00         *(u64 *)(r10 - 8) = r2
          2:       b7 02 00 00 00 00 00 00         r2 = 0
          3:       b7 03 00 00 88 a2 00 00         r3 = 41608
          4:       2d 23 03 00 00 00 00 00         if r3 > r2 goto +3 <LBB0_2>
          5:       b7 02 00 00 08 00 00 00         r2 = 8
          6:       79 13 00 00 00 00 00 00         r3 = *(u64 *)(r1 + 0)
          7:       05 00 02 00 00 00 00 00         goto +2 <LBB0_3>
  
  0000000000000040 LBB0_2:
          8:       b7 02 00 00 08 00 00 00         r2 = 8
          9:       79 13 08 00 00 00 00 00         r3 = *(u64 *)(r1 + 8)
  
  0000000000000050 LBB0_3:
         10:       0f 23 00 00 00 00 00 00         r3 += r2
         11:       bf a1 00 00 00 00 00 00         r1 = r10
         12:       07 01 00 00 f8 ff ff ff         r1 += -8
         13:       b7 02 00 00 08 00 00 00         r2 = 8
         14:       85 00 00 00 04 00 00 00         call 4

Instructions #2, #5 and #8 need relocation resolutions from the loader.
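
To make the byte dumps above easier to read, the following sketch decodes
one 8-byte instruction using the standard little-endian eBPF encoding:
opcode, then dst/src registers packed into one byte, then a 16-bit offset
and a 32-bit immediate. `struct decoded_insn`, `decode_insn` and
`encode_imm` are illustrative names, not loader or LLVM APIs.

```c
#include <assert.h>
#include <stdint.h>

struct decoded_insn {
  uint8_t opcode;
  uint8_t dst_reg;
  uint8_t src_reg;
  int16_t off;
  int32_t imm;
};

/* Decode 8 raw bytes of a little-endian bpf instruction. */
static struct decoded_insn decode_insn(const uint8_t b[8]) {
  struct decoded_insn d;
  d.opcode = b[0];
  d.dst_reg = b[1] & 0x0f; /* low nibble */
  d.src_reg = b[1] >> 4;   /* high nibble */
  d.off = (int16_t)(uint16_t)(b[2] | (b[3] << 8));
  d.imm = (int32_t)((uint32_t)b[4] | ((uint32_t)b[5] << 8) |
                    ((uint32_t)b[6] << 16) | ((uint32_t)b[7] << 24));
  return d;
}

/* Rewrite only the immediate field, as a relocation would. */
static void encode_imm(uint8_t b[8], int32_t imm) {
  uint32_t u = (uint32_t)imm;
  b[4] = (uint8_t)(u & 0xff);
  b[5] = (uint8_t)((u >> 8) & 0xff);
  b[6] = (uint8_t)((u >> 16) & 0xff);
  b[7] = (uint8_t)((u >> 24) & 0xff);
}
```

Decoding the bytes of instruction #5 gives opcode 0xb7, dst_reg 2 and
imm 8, matching "r2 = 8"; the immediate is the only field a relocation
of this kind needs to rewrite.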

Default Action
==============

Offset relocation is off by default; it needs the llc "-mattr=offsetreloc" flag
to turn it on. Patchable extern relocation is on by default with -g.

Caveats and Further Work
========================

Currently, if the getelementptr access indices are all zero, like "0:0:0",
llvm will optimize the access to use the base pointer directly. We may need
a backend specific flag to instruct the InstructionCombiner to avoid this
optimization for bpf if -mattr=offsetreloc is on.

In llvm IR, all user level unions are converted to structs and type casts.
Thus, in the IR, the original union access pattern may get lost, as the same
structurized-union access sequence may correspond to multiple possible
user-level accesses. For example,

  union {
    struct {
      int a;
      int b;
    } c;
    struct {
      int d;
      int e;
    } f;
    int g;
  } u;

u.g, u.c.a and u.f.d may have the same getelementptr, but they represent
different user level accesses. The CO-RE project uses a vmlinux.h generated
from vmlinux as the kernel headers. It may be possible to work around
this issue by converting all unions to structs in the vmlinux BTF to
vmlinux.h conversion.
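
The ambiguity can be seen directly in C: the three accesses named above
all resolve to the same byte offset, so the offset alone cannot tell them
apart. The union tag `u_example` and the helper `union_accesses_alias`
are names added here for illustration only.

```c
#include <assert.h>
#include <stddef.h>

union u_example {
  struct {
    int a;
    int b;
  } c;
  struct {
    int d;
    int e;
  } f;
  int g;
};

/* Returns 1 when u.g, u.c.a and u.f.d all share one byte offset. */
static int union_accesses_alias(void) {
  return offsetof(union u_example, g) == offsetof(union u_example, c.a) &&
         offsetof(union u_example, c.a) == offsetof(union u_example, f.d);
}
```

All three offsets are 0 here, yet a relocation for u.c.a should follow
member `a` of `c`, while one for u.g should follow `g`.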

The current implementation has only limited support for unions.

In the above OffsetReloc, the field "Dependency" is used to capture
a chain of getelementptrs, i.e., one getelementptr is the base address
of another getelementptr. This is mostly useful for unions, as one
getelementptr result could be type cast as the base of another getelementptr.
We have also observed that chained getelementptrs may sometimes exist for
structures as well, especially when both the whole structure and a structure
member are used in bpf_probe_read().


Repository:
  rL LLVM

https://reviews.llvm.org/D61524

Files:
  lib/Target/BPF/BPF.h
  lib/Target/BPF/BPF.td
  lib/Target/BPF/BPFAbstrctMemberAccess.cpp
  lib/Target/BPF/BPFAsmPrinter.cpp
  lib/Target/BPF/BPFMISimplifyPatchable.cpp
  lib/Target/BPF/BPFSubtarget.cpp
  lib/Target/BPF/BPFSubtarget.h
  lib/Target/BPF/BPFTargetMachine.cpp
  lib/Target/BPF/BTF.h
  lib/Target/BPF/BTFDebug.cpp
  lib/Target/BPF/BTFDebug.h
  lib/Target/BPF/CMakeLists.txt
  test/CodeGen/BPF/BTF/binary-format.ll
  test/CodeGen/BPF/BTF/extern-global-var.ll
  test/CodeGen/BPF/BTF/filename.ll
  test/CodeGen/BPF/BTF/func-func-ptr.ll
  test/CodeGen/BPF/BTF/func-non-void.ll
  test/CodeGen/BPF/BTF/func-source.ll
  test/CodeGen/BPF/BTF/func-typedef.ll
  test/CodeGen/BPF/BTF/func-unused-arg.ll
  test/CodeGen/BPF/BTF/func-void.ll
  test/CodeGen/BPF/BTF/local-var.ll
  test/CodeGen/BPF/BTF/static-var-derived-type.ll
  test/CodeGen/BPF/BTF/static-var-inited-sec.ll
  test/CodeGen/BPF/BTF/static-var-inited.ll
  test/CodeGen/BPF/BTF/static-var-readonly-sec.ll
  test/CodeGen/BPF/BTF/static-var-readonly.ll
  test/CodeGen/BPF/BTF/static-var-sec.ll
  test/CodeGen/BPF/BTF/static-var-zerolen-array.ll
  test/CodeGen/BPF/BTF/static-var.ll
  test/CodeGen/BPF/CORE/offset-reloc-basic.ll
  test/CodeGen/BPF/CORE/offset-reloc-multilevel.ll
  test/CodeGen/BPF/CORE/offset-reloc-struct-anonymous.ll
  test/CodeGen/BPF/CORE/offset-reloc-struct-array.ll
  test/CodeGen/BPF/CORE/offset-reloc-union.ll
  test/CodeGen/BPF/CORE/patchable-extern-char.ll
  test/CodeGen/BPF/CORE/patchable-extern-uint.ll
  test/CodeGen/BPF/CORE/patchable-extern-ulonglong.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D61524.198045.patch
Type: text/x-patch
Size: 191479 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20190503/d075a54e/attachment-0001.bin>

