[PATCH] D140804: [BPF] support for BPF_ST instruction in codegen
Eduard Zingerman via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Aug 4 09:08:05 PDT 2023
eddyz87 created this revision.
Herald added a subscriber: hiraditya.
Herald added a project: All.
eddyz87 updated this revision to Diff 545494.
eddyz87 added a comment.
eddyz87 updated this revision to Diff 547064.
eddyz87 retitled this revision from "BPF: support for BPF_ST instruction in codegen" to "[BPF] support for BPF_ST instruction in codegen".
eddyz87 edited the summary of this revision.
eddyz87 updated this revision to Diff 547186.
eddyz87 published this revision for review.
eddyz87 added a reviewer: yonghong-song.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.
Generate BPF_ST when cpuv4 is specified.
eddyz87 added a comment.
BPFMISimplifyPatchable::checkADDrr for BPF_ST instructions.
eddyz87 added a comment.
Rebase
eddyz87 added a comment.
Hi Yonghong,
Could you please take a look at this revision? It enables generation of the BPF_ST instruction when CPUv4 is selected.
When it is enabled, the following kernel BPF selftests fail:
- log_fixup/missing_map
- spin_lock/lock_id_mapval_preserve
- spin_lock/lock_id_innermapval_preserve
All failures are caused by differences in the expected log messages (when BPF_ST is enabled, fewer instructions are generated, so the instruction numbers in the log are slightly off). I will submit a kernel patch to relax the log messages after this revision is accepted (but before landing it).
Impact, based on the kernel selftests:
- in total 653 *.bpf.o files are generated
- 377 are identical
- 265 have fewer instructions
- 2 have more instructions
Most of the changes are obvious: sequences like `r0 = 0; *(u64 *)(r10 - 8) = r0;` are replaced by a single instruction. For tests where the number of instructions increased I took a closer look:
- ip_check_defrag.bpf.o, # of insns increased from 58 to 63: a few more instructions are generated because of a difference in register allocation
- pyperf_subprogs.bpf.o, # of insns increased from 4421 to 4434: I can't pinpoint the stage at which the additional instructions are generated; they seem to accumulate due to slight differences in register allocation and spilling decisions.
Generate the store immediate instruction (BPF_ST) when CPUv4 is enabled.
For example:
$ cat test.c
struct foo {
  unsigned char b;
  unsigned short h;
  unsigned int w;
  unsigned long d;
};
void bar(volatile struct foo *p) {
  p->b = 1;
  p->h = 2;
  p->w = 3;
  p->d = 4;
}
$ clang -O2 --target=bpf -mcpu=v4 test.c -c -o - | llvm-objdump -d -
...
0000000000000000 <bar>:
0: 72 01 00 00 01 00 00 00 *(u8 *)(r1 + 0x0) = 0x1
1: 6a 01 02 00 02 00 00 00 *(u16 *)(r1 + 0x2) = 0x2
2: 62 01 04 00 03 00 00 00 *(u32 *)(r1 + 0x4) = 0x3
3: 7a 01 08 00 04 00 00 00 *(u64 *)(r1 + 0x8) = 0x4
4: 95 00 00 00 00 00 00 00 exit
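The opcode bytes in the dump above can be read off mechanically. The following is a minimal sketch, not code from the patch: it assumes the little-endian BPF encoding shown in the dump (opcode byte, then a byte whose low nibble is the destination register, a 16-bit offset, and a 32-bit immediate). The names `st_insn`, `decode_insn` and `store_width_bytes` are hypothetical; the opcode field values are the standard BPF ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Opcode field values from the BPF instruction set. */
enum {
    BPF_ST  = 0x02, /* instruction class: store immediate */
    BPF_MEM = 0x60, /* addressing mode: regular memory access */
    BPF_W   = 0x00, /* size: 32-bit */
    BPF_H   = 0x08, /* size: 16-bit */
    BPF_B   = 0x10, /* size: 8-bit */
    BPF_DW  = 0x18  /* size: 64-bit */
};

/* Hypothetical decoded form of one 8-byte instruction. */
struct st_insn {
    uint8_t opcode;
    uint8_t dst_reg;
    int16_t off;
    int32_t imm;
};

/* Decode one instruction, assuming little-endian byte order. */
static struct st_insn decode_insn(const uint8_t *b)
{
    struct st_insn insn;
    assert(b != NULL);
    insn.opcode = b[0];
    insn.dst_reg = b[1] & 0x0f;
    insn.off = (int16_t)(uint16_t)(b[2] | (b[3] << 8));
    insn.imm = (int32_t)((uint32_t)b[4] | ((uint32_t)b[5] << 8) |
                         ((uint32_t)b[6] << 16) | ((uint32_t)b[7] << 24));
    return insn;
}

/* Return the store width in bytes for a (BPF_ST | BPF_MEM | size) opcode,
 * or 0 if the opcode is not a store-immediate instruction. */
static int store_width_bytes(uint8_t opcode)
{
    if ((opcode & 0x07) != BPF_ST || (opcode & 0xe0) != BPF_MEM)
        return 0;
    switch (opcode & 0x18) {
    case BPF_B:  return 1;
    case BPF_H:  return 2;
    case BPF_W:  return 4;
    default:     return 8; /* BPF_DW */
    }
}
```

Applied to the dump, 0x72, 0x6a, 0x62 and 0x7a decode as 1-, 2-, 4- and 8-byte immediate stores respectively, while 0x95 (exit) is not a BPF_ST instruction.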
Take special care to:
- apply `BPFMISimplifyPatchable::checkADDrr` rewrite for BPF_ST
- validate the immediate value when the BPF_ST write is 64-bit: BPF interprets `(BPF_ST | BPF_MEM | BPF_DW)` writes as writes with sign extension of the 32-bit immediate. Thus it is fine to generate such a write when the immediate is -1, but it is incorrect to generate such a write when the immediate is +0xffff_ffff, because the value stored would be 0xffff_ffff_ffff_ffff.
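The 64-bit immediate check described above can be illustrated with a small sketch. This is not the code from the patch (the actual check lives in the TableGen patterns in BPFInstrInfo.td); the helper names `fits_st_dw_imm` and `st_dw_stored_value` are hypothetical and only model the rule that a 64-bit store of an immediate is valid when sign-extending the 32-bit immediate reproduces the intended value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when a 64-bit value can be written with a
 * (BPF_ST | BPF_MEM | BPF_DW) instruction, i.e. when the value lies in
 * the signed 32-bit range and so survives sign extension. */
static bool fits_st_dw_imm(int64_t value)
{
    return value >= INT32_MIN && value <= INT32_MAX;
}

/* Models the 64-bit value stored for a given 32-bit immediate:
 * the immediate is sign-extended to 64 bits. */
static uint64_t st_dw_stored_value(int32_t imm)
{
    return (uint64_t)(int64_t)imm;
}

/* The two cases named in the summary, plus the value 4 from test.c. */
static void check_examples(void)
{
    assert(fits_st_dw_imm(-1));            /* stored as all ones: correct */
    assert(!fits_st_dw_imm(0xffffffffLL)); /* would be stored as all ones: wrong */
    assert(fits_st_dw_imm(4));
    assert(st_dw_stored_value(-1) == 0xffffffffffffffffULL);
}
```

A value that fails the check would have to be materialized in a register first and written with a register store instead.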
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D140804
Files:
llvm/lib/Target/BPF/BPFInstrInfo.td
llvm/lib/Target/BPF/BPFMISimplifyPatchable.cpp
llvm/lib/Target/BPF/BPFSubtarget.cpp
llvm/lib/Target/BPF/BPFSubtarget.h
llvm/test/CodeGen/BPF/CORE/field-reloc-st-imm.ll
llvm/test/CodeGen/BPF/store_imm.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D140804.547186.patch
Type: text/x-patch
Size: 21102 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20230804/d13620dd/attachment-0001.bin>