[PATCH] D88525: [RFC] BPF: use Source instead of ILP scheduler for selection dag

Yonghong Song via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 29 21:02:39 PDT 2020


yonghong-song created this revision.
yonghong-song added a reviewer: ast.
Herald added subscribers: llvm-commits, hiraditya.
Herald added a project: LLVM.
yonghong-song requested review of this revision.

NOTE: I put a RFC tag as there are two issues to be resolved: (1). whether we have a better approach, e.g., fix in selectiondag instead of BPF backend? (2). currently selftest `test_progs-no_alu32 -n 7/4` (test_verif_scale3.o) failed as processed insns exceed kernel limit, which needs investigation.

This is to fix a bug reported by

  https://github.com/iovisor/bpftrace/issues/1305

For the following initial selection dag,

  t79: ch,glue = BPFISD::CALL t78, Constant:i64<4>, Register:i64 $r1, Register:i64 $r2, Register:i64 $r3, t78:1
  t80: ch,glue = callseq_end t79, TargetConstant:i64<0>, TargetConstant:i64<0>, t79:1 
    t81: i64,ch,glue = CopyFromReg t80, Register:i64 $r0, t80:1
  t82: i64,ch = load<(dereferenceable load 8 from %ir."struct cgroup.kn")> t81:1, FrameIndex:i64<2>, undef:i64
            t83: ch = lifetime.end<0 to -1> t82:1, TargetFrameIndex:i64<2>
          t86: ch = lifetime.start<0 to -1> t83, TargetFrameIndex:i64<1> 
        t89: ch = store<(store 8 into %ir.20)> t86, Constant:i64<0>, FrameIndex:i64<1>, undef:i64
      t91: ch = lifetime.start<0 to -1> t89, TargetFrameIndex:i64<0>
    t92: ch,glue = callseq_start t91, TargetConstant:i64<0>, TargetConstant:i64<0>
  t93: ch,glue = CopyToReg t92, Register:i64 $r1, FrameIndex:i64<0>
  t94: ch,glue = CopyToReg t93, Register:i64 $r2, Constant:i64<8>, t93:1
    t87: i64 = add t82, Constant:i64<8>
  t95: ch,glue = CopyToReg t94, Register:i64 $r3, t87, t94:1
  t96: ch,glue = BPFISD::CALL t95, Constant:i64<4>, Register:i64 $r1, Register:i64 $r2, Register:i64 $r3, t95:1
  t97: ch,glue = callseq_end t96, TargetConstant:i64<0>, TargetConstant:i64<0>, t96:1

Note that node t89 depends on t86 which recursively depends on t82. This will enforce
load happens before store.

The optimized dag becomes

  t79: ch,glue = BPFISD::CALL t78, Constant:i64<4>, Register:i64 $r1, Register:i64 $r2, Register:i64 $r3, t78:1
  t80: ch,glue = callseq_end t79, TargetConstant:i64<0>, TargetConstant:i64<0>, t79:1
                t81: i64,ch,glue = CopyFromReg t80, Register:i64 $r0, t80:1
              t131: ch = TokenFactor t81:1, t130:1
            t83: ch = lifetime.end<0 to -1> t131, TargetFrameIndex:i64<2>
          t86: ch = lifetime.start<0 to -1> t83, TargetFrameIndex:i64<1>
          t128: ch = store<(store 8 into %ir.20)> t80, Constant:i64<0>, FrameIndex:i64<1>, undef:i64
        t129: ch = TokenFactor t86, t128
      t91: ch = lifetime.start<0 to -1> t129, TargetFrameIndex:i64<0>
    t92: ch,glue = callseq_start t91, TargetConstant:i64<0>, TargetConstant:i64<0>
  t93: ch,glue = CopyToReg t92, Register:i64 $r1, FrameIndex:i64<0>
  t94: ch,glue = CopyToReg t93, Register:i64 $r2, Constant:i64<8>, t93:1
    t87: i64 = add t130, Constant:i64<8>

...

  t130: i64,ch = load<(dereferenceable load 8 from %ir."struct cgroup.kn")> t80, FrameIndex:i64<2>, undef:i64

For the optimized tag, store t128 now depends on t80 and load t130 also
depends on t80. Depending on how schedule dag scheduler works,
this opens possibility that load and store may be reordered.

Note the above optimized dag code is generated for both bpf and x86.
But x86 actually generates correct code. The reason is that x86 (and many other
architectures) is using "Source" selection dag scheduler which favors
source order in case of multiple choices. The default for bpf is ILP 
which tries to optimize for instruction parallelism.

I disabled "Source" scheduler for x86 and used "ILP" scheduler,
the generated code still correct. This is because x86 has different lowering
code and it happens to work. Tweaking "ILP" might not be a robust idea
for BPF, so this patch enabled BPF to use "Source" scheduler, the same as
most, if not all, other architectures.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D88525

Files:
  llvm/lib/Target/BPF/BPFSubtarget.h
  llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-byte-size-1.ll
  llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-byte-size-2.ll
  llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-existence-1.ll
  llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-lshift-1-bpfeb.ll
  llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-lshift-1.ll
  llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-rshift-1.ll
  llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-signedness-1.ll
  llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-signedness-2.ll
  llvm/test/CodeGen/BPF/CORE/offset-reloc-fieldinfo-1.ll
  llvm/test/CodeGen/BPF/CORE/offset-reloc-fieldinfo-2-bpfeb.ll
  llvm/test/CodeGen/BPF/CORE/offset-reloc-fieldinfo-2.ll
  llvm/test/CodeGen/BPF/CORE/offset-reloc-global-3.ll
  llvm/test/CodeGen/BPF/intrinsics.ll
  llvm/test/CodeGen/BPF/lifetime.ll
  llvm/test/CodeGen/BPF/objdump_intrinsics.ll
  llvm/test/CodeGen/BPF/objdump_nop.ll
  llvm/test/CodeGen/BPF/remove_truncate_3.ll
  llvm/test/CodeGen/BPF/rodata_5.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D88525.295168.patch
Type: text/x-patch
Size: 20917 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200930/a93d8ea9/attachment.bin>


More information about the llvm-commits mailing list