[PATCH] D73985: [bpf] zero extension is required in BPF implementation so remove <<=32 >>=32

Tue Feb 4 11:44:43 PST 2020

jrfastab created this revision.
jrfastab added reviewers: ast, yonghong-song.
jrfastab added a project: LLVM.
Herald added subscribers: llvm-commits, hiraditya.

  The current pattern matching for zext produces the following code
  snippet:
  
    w1 = w0
    r1 <<= 32
    r1 >>= 32
  
  Because BPF implementations are required to zero-extend on 32-bit loads,
  the two shifts add unneeded instructions and also make it harder for the
  verifier to track the bounds of the r1 register. For example, in the
  verifier trace below, the R2 offset is effectively unknown at the end of
  the snippet. Tracked correctly, w1 would have the same bounds as r8: the
  R8 smax is less than the U32 max value, so a zero-extending load should
  preserve the value. Adding a max value of 800
  (R8=inv(id=0,smax_value=800)) to off=0, as seen in R7, should give a max
  offset of 800. Instead, at the end of the snippet the R2 max offset is
  0xffffFFFF.
  
    R0=inv(id=0,smax_value=800)
    R1_w=inv(id=0,umax_value=2147483647,var_off=(0x0; 0x7fffffff))
    R6=ctx(id=0,off=0,imm=0) R7=map_value(id=0,off=0,ks=4,vs=1600,imm=0)
    R8_w=inv(id=0,smax_value=800,umax_value=4294967295,var_off=(0x0; 0xffffffff))
    R9=inv800 R10=fp0 fp-8=mmmm????
   58: (1c) w9 -= w8
   59: (bc) w1 = w8
   60: (67) r1 <<= 32
   61: (77) r1 >>= 32
   62: (bf) r2 = r7
   63: (0f) r2 += r1
   64: (bf) r1 = r6
   65: (bc) w3 = w9
   66: (b7) r4 = 0
   67: (85) call bpf_get_stack#67
    R0=inv(id=0,smax_value=800)
    R1_w=ctx(id=0,off=0,imm=0)
    R2_w=map_value(id=0,off=0,ks=4,vs=1600,umax_value=4294967295,var_off=(0x0; 0xffffffff))
    R3_w=inv(id=0,umax_value=800,var_off=(0x0; 0x3ff))
    R4_w=inv0 R6=ctx(id=0,off=0,imm=0)
    R7=map_value(id=0,off=0,ks=4,vs=1600,imm=0)
    R8_w=inv(id=0,smax_value=800,umax_value=4294967295,var_off=(0x0; 0xffffffff))
    R9_w=inv(id=0,umax_value=800,var_off=(0x0; 0x3ff))
    R10=fp0 fp-8=mmmm????
  
  After this patch, the R1 bounds are no longer smashed by the <<=32 >>=32
  shifts, and we get the correct bound on R2: umax_value=800.
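
  To illustrate why the R2 bound matters for the bpf_get_stack call, here is
  a minimal sketch of a worst-case range check, using vs=1600 and the
  umax_value=800 size bound on R3 from the trace above. The MapValuePtr
  struct and access_in_bounds function are hypothetical names made up for
  this example; the kernel verifier's real logic is more involved.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model of a map-value pointer as printed in the
// trace. This is NOT the kernel verifier's data structure.
struct MapValuePtr {
  uint64_t fixed_off;   // "off=" in the trace
  uint64_t umax_var;    // "umax_value=" of the variable part of the pointer
  uint64_t value_size;  // "vs=" in the trace
};

// Returns true if fixed_off + umax_var + max_size <= value_size, i.e. the
// worst-case access stays inside the map value. The comparison is written
// as subtractions so it cannot overflow.
constexpr bool access_in_bounds(const MapValuePtr &p, uint64_t max_size) {
  if (p.fixed_off > p.value_size) return false;
  uint64_t room = p.value_size - p.fixed_off;
  if (p.umax_var > room) return false;
  return max_size <= room - p.umax_var;
}

// Before the patch: R2 umax_value=4294967295, so the check cannot succeed.
static_assert(!access_in_bounds({0, 0xffffffffULL, 1600}, 800), "");
// After the patch: R2 umax_value=800, and 0 + 800 + 800 <= 1600.
static_assert(access_in_bounds({0, 800, 1600}, 800), "");
```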
  
  It also reduces three instructions to one.
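
  As a sanity check on the equivalence, the following sketch models both
  sequences on plain 64-bit integers. It assumes, as stated above, that the
  32-bit move already zero-extends into the full register; mov32, zext_old
  and zext_new are illustrative names, not LLVM or kernel functions.

```cpp
#include <cassert>
#include <cstdint>

// Models "w1 = w0" under the assumption that the 32-bit move clears the
// upper 32 bits of the destination register.
constexpr uint64_t mov32(uint64_t src) { return static_cast<uint32_t>(src); }

// Old lowering: w1 = w0; r1 <<= 32; r1 >>= 32 (logical shift right).
constexpr uint64_t zext_old(uint64_t r0) {
  uint64_t r1 = mov32(r0);
  r1 <<= 32;
  r1 >>= 32;
  return r1;
}

// New lowering: w1 = w0 only.
constexpr uint64_t zext_new(uint64_t r0) { return mov32(r0); }

// Both lowerings agree, including when the upper half of the source is set.
static_assert(zext_old(0xdeadbeef12345678ULL) == 0x12345678ULL, "");
static_assert(zext_new(0xdeadbeef12345678ULL) == 0x12345678ULL, "");
static_assert(zext_old(0xffffffffffffffffULL) == zext_new(0xffffffffffffffffULL), "");
```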


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D73985

Files:
  llvm/lib/Target/BPF/BPFISelLowering.cpp
  llvm/lib/Target/BPF/BPFInstrInfo.td


Index: llvm/lib/Target/BPF/BPFInstrInfo.td
===================================================================
--- llvm/lib/Target/BPF/BPFInstrInfo.td
+++ llvm/lib/Target/BPF/BPFInstrInfo.td
@@ -732,8 +732,7 @@
 def : Pat<(i64 (sext GPR32:$src)),
           (SRA_ri (SLL_ri (MOV_32_64 GPR32:$src), 32), 32)>;
 
-def : Pat<(i64 (zext GPR32:$src)),
-          (SRL_ri (SLL_ri (MOV_32_64 GPR32:$src), 32), 32)>;
+def : Pat<(i64 (zext GPR32:$src)), (MOV_32_64 GPR32:$src)>;
 
 // For i64 -> i32 truncation, use the 32-bit subregister directly.
 def : Pat<(i32 (trunc GPR:$src)),
Index: llvm/lib/Target/BPF/BPFISelLowering.cpp
===================================================================
--- llvm/lib/Target/BPF/BPFISelLowering.cpp
+++ llvm/lib/Target/BPF/BPFISelLowering.cpp
@@ -570,6 +570,12 @@
   DebugLoc DL = MI.getDebugLoc();
 
   MachineRegisterInfo &RegInfo = F->getRegInfo();
+
+  if (!isSigned) {
+    Register PromotedReg0 = RegInfo.createVirtualRegister(RC);
+    BuildMI(BB, DL, TII.get(BPF::MOV_32_64), PromotedReg0).addReg(Reg);
+    return PromotedReg0;
+  }
   Register PromotedReg0 = RegInfo.createVirtualRegister(RC);
   Register PromotedReg1 = RegInfo.createVirtualRegister(RC);
   Register PromotedReg2 = RegInfo.createVirtualRegister(RC);

