[llvm] [BPF] Use 32-bit move for zero extension when possible (PR #77501)

via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 9 09:18:21 PST 2024


https://github.com/eddyz87 created https://github.com/llvm/llvm-project/pull/77501

When ALU32 is available, use a 32-bit register-to-register move to implement 32-bit to 64-bit zero extension.
Before this patch, the following IR instructions:

```llvm
%a = and i64 %x, 4294967295
%a = zext i32 %x to i64
```

were translated to a pair of shifts, e.g.:

```
r0 <<= 0x20
r0 >>= 0x20
```

Now these should be translated as a single assignment, e.g.:

```
w0 = w0
```

This form is shorter and friendlier to the kernel verifier.
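
To illustrate why the single move can replace the shift pair, here is a minimal Python sketch that models a 64-bit BPF register as an integer. The function names are hypothetical and not part of the patch. The sketch assumes that `>>=` is a logical right shift and that a 32-bit move zeroes the upper 32 bits of the destination register.

```python
MASK64 = (1 << 64) - 1  # width of an rN register
MASK32 = (1 << 32) - 1  # width of a wN subregister


def zext_via_shifts(r: int) -> int:
    """Model 'r <<= 0x20; r >>= 0x20' on a 64-bit register value."""
    r = (r << 32) & MASK64  # left shift, bits above bit 63 are discarded
    return r >> 32          # logical right shift brings zeros into the top


def zext_via_mov32(r: int) -> int:
    """Model 'w = w': keep the low 32 bits, zero the upper 32 bits."""
    return r & MASK32


def zext_via_and(r: int) -> int:
    """Model the IR-level 'and i64 %x, 4294967295'."""
    return r & 0xFFFFFFFF


if __name__ == "__main__":
    for value in (0, 1, 0xFFFFFFFF, 0x1_0000_0000, 0xDEADBEEF_CAFEBABE, MASK64):
        assert zext_via_shifts(value) == zext_via_mov32(value) == zext_via_and(value)
    print("all three forms agree on the sampled values")
```

All three functions compute the same value for any 64-bit input, which is the property the new pattern relies on.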

When tested on the kernel BPF selftests, 38 out of 2127 BPF object files change; all changes appear to follow the expected pattern:

```diff
-     r_ <<= 0x20
-     r_ >>= 0x20
+     w_ = w_
```
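
One way to check such changes mechanically is to rewrite the old sequence in a "before" disassembly and compare the result with the "after" disassembly. The helper below is a hypothetical sketch, not the script used for the numbers above. It assumes each input line holds only the instruction text (no addresses or opcode bytes), and it does not model cases where the compiler also folds a preceding `r0 = r1` into `w0 = w1`.

```python
import re

# Match the two halves of the old zero-extension idiom on a register rN.
SHIFT_LEFT = re.compile(r"^\s*r(\d+) <<= 0x20\s*$")
SHIFT_RIGHT = re.compile(r"^\s*r(\d+) >>= 0x20\s*$")


def collapse_shift_pairs(lines):
    """Replace each adjacent 'rN <<= 0x20' / 'rN >>= 0x20' pair with 'wN = wN'."""
    out = []
    i = 0
    while i < len(lines):
        left = SHIFT_LEFT.match(lines[i])
        right = SHIFT_RIGHT.match(lines[i + 1]) if i + 1 < len(lines) else None
        if left and right and left.group(1) == right.group(1):
            reg = left.group(1)
            out.append(f"w{reg} = w{reg}")
            i += 2  # both shift instructions are consumed
        else:
            out.append(lines[i].strip())
            i += 1
    return out


if __name__ == "__main__":
    before = ["r1 <<= 0x20", "r1 >>= 0x20", "exit"]
    print(collapse_shift_pairs(before))
```

Shifts on different registers, or a lone shift, are left unchanged, so only the exact idiom is rewritten.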

From c1533723453903e0a414e1b0d0efe8df11b43fa3 Mon Sep 17 00:00:00 2001
From: Eduard Zingerman <eddyz87 at gmail.com>
Date: Tue, 9 Jan 2024 17:45:04 +0200
Subject: [PATCH] [BPF] Use 32-bit move for zero extension when possible

When ALU32 is available, use 32-bit register to register assignment as
a 32-bit to 64-bit zero extension.
Before this patch the following IR instructions:

  %a = and i64 %x, 4294967295
  %a = zext i32 %x to i64

Were translated as a pair of shifts, e.g.:

  r0 <<= 0x20
  r0 >>= 0x20

Now these should be translated as a single assignment, e.g.:

  w0 = w0

Which is shorter and more friendly for kernel verifier.

When tested on kernel BPF selftests there are changes in 38 out of 2127
BPF object files, all changes appear to follow expected pattern:

  -     r_ <<= 0x20
  -     r_ >>= 0x20
  +     w_ = w_
---
 llvm/lib/Target/BPF/BPFInstrInfo.td       | 17 ++++++++++--
 llvm/test/CodeGen/BPF/zext-upper-32bit.ll | 34 +++++++++++++++++++++++
 2 files changed, 48 insertions(+), 3 deletions(-)
 create mode 100644 llvm/test/CodeGen/BPF/zext-upper-32bit.ll

diff --git a/llvm/lib/Target/BPF/BPFInstrInfo.td b/llvm/lib/Target/BPF/BPFInstrInfo.td
index 7d443a34490146..7a9445a1030d5a 100644
--- a/llvm/lib/Target/BPF/BPFInstrInfo.td
+++ b/llvm/lib/Target/BPF/BPFInstrInfo.td
@@ -728,9 +728,20 @@ let usesCustomInserter = 1, isCodeGenOnly = 1 in {
 // load 64-bit global addr into register
 def : Pat<(BPFWrapper tglobaladdr:$in), (LD_imm64 tglobaladdr:$in)>;
 
-// 0xffffFFFF doesn't fit into simm32, optimize common case
-def : Pat<(i64 (and (i64 GPR:$src), 0xffffFFFF)),
-          (SRL_ri (SLL_ri (i64 GPR:$src), 32), 32)>;
+// 0xffffFFFF doesn't fit into simm32, optimize common case.
+// Use sequence 'rX <<= 32; rX >>= 32;' if 32-bits ops are not available.
+let Predicates = [BPFNoALU32] in {
+  def : Pat<(i64 (and (i64 GPR:$src), 0xffffFFFF)),
+            (SRL_ri (SLL_ri (i64 GPR:$src), 32), 32)>;
+}
+// Use sequence 'wX = wX' if 32-bits ops are available.
+let Predicates = [BPFHasALU32] in {
+  def : Pat<(i64 (and (i64 GPR:$src), 0xffffFFFF)),
+            (INSERT_SUBREG
+              (i64 (IMPLICIT_DEF)),
+              (MOV_rr_32 (i32 (EXTRACT_SUBREG GPR:$src, sub_32))),
+              sub_32)>;
+}
 
 // Calls
 def : Pat<(BPFcall tglobaladdr:$dst), (JAL tglobaladdr:$dst)>;
diff --git a/llvm/test/CodeGen/BPF/zext-upper-32bit.ll b/llvm/test/CodeGen/BPF/zext-upper-32bit.ll
new file mode 100644
index 00000000000000..553765c8f9d310
--- /dev/null
+++ b/llvm/test/CodeGen/BPF/zext-upper-32bit.ll
@@ -0,0 +1,34 @@
+; RUN: llc -march=bpfel -mcpu=v3 --filetype=obj < %s | llvm-objdump -d - \
+; RUN: | FileCheck --check-prefix=ALU32 %s
+; RUN: llc -march=bpfel -mcpu=v2 --filetype=obj < %s | llvm-objdump -d - \
+; RUN: | FileCheck --check-prefix=NOALU32 %s
+
+define dso_local i64 @test1(i64 %x) {
+entry:
+  %a = and i64 %x, 4294967295
+  ret i64 %a
+}
+; ALU32:      <test1>:
+; ALU32-NEXT: w0 = w1
+; ALU32-NEXT: exit
+
+; NOALU32:      <test1>:
+; NOALU32-NEXT: r0 = r1
+; NOALU32-NEXT: r0 <<= 0x20
+; NOALU32-NEXT: r0 >>= 0x20
+; NOALU32-NEXT: exit
+
+define dso_local i64 @test2(i32 %x) {
+entry:
+  %a = zext i32 %x to i64
+  ret i64 %a
+}
+; ALU32:      <test2>:
+; ALU32-NEXT: w0 = w1
+; ALU32-NEXT: exit
+
+; NOALU32:      <test2>:
+; NOALU32-NEXT: r0 = r1
+; NOALU32-NEXT: r0 <<= 0x20
+; NOALU32-NEXT: r0 >>= 0x20
+; NOALU32-NEXT: exit


