[llvm] [BPF] expand mem intrinsics (memcpy, memmove, memset) (PR #97648)

Thu Jul 4 22:03:09 PDT 2024

eddyz87 wrote:

> I'd like to figure out how to, and I kindly ask if "memset" is unrolled then it is friendly to the verifier, what should we do if the memset is called with dynamic length? (Expand it, or report error?)

In certain cases the verifier would be able to handle a dynamic length.
Suppose there are two programs like:

```
# program A                    # program B
1: r1 = 0;                     1:       r1 = 0;  
2: *(u64 *)(r10 - 8) = r1;     2:       r2 = 0;
3: *(u64 *)(r10 - 16) = r1;    3:       r3 = r10;
4: *(u64 *)(r10 - 24) = r1;    4:   l1: r3 += -8;
5: *(u64 *)(r10 - 32) = r1;    5:       *(u64 *)(r3 - 0) = r1;
                               6:       r2 += 1;
                               7:       if r2 != 4 goto l1; 
```
In essence, the verifier traces each execution path in the program (it is probably more accurate to say that it performs abstract interpretation) and tries to prune traces that reach equivalent states. This logic makes loops a pain point. For program A the verifier would trace the path `1,2,3,4,5`, while for program B it would trace the path `1,2,3,4,5,6,7,4,5,6,7,4,5,6,7,4,5,6,7`, which is significantly longer. The bound at instruction B.7 does not necessarily have to be a constant; it could be a register whose value is known to the verifier (e.g. if we add an instruction B.0 with `r4 = 4`, the instruction at B.7 could be replaced by `if r2 != r4 goto l1` and the trace would be the same).
Hence, it is preferable but not strictly necessary to unroll the loops.
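
To make the difference in trace length concrete, here is a toy C model of the two programs. It is only a sketch: it executes the instructions concretely and counts them, whereas the real verifier works on abstract states. The function names `run_unrolled` and `run_loop` are made up for illustration, and the four-slot array stands in for the stack slots `r10 - 32` .. `r10 - 8`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model (hypothetical, not verifier code): stack[3] stands for
 * r10 - 8, stack[0] for r10 - 32. Each function returns the number of
 * instructions on the single path it walks. */

/* Program A: one register init followed by four unrolled stores. */
static size_t run_unrolled(uint64_t stack[4])
{
    size_t steps = 0;
    uint64_t r1 = 0;  steps++;   /* A.1: r1 = 0                 */
    stack[3] = r1;    steps++;   /* A.2: *(u64 *)(r10 - 8)  = r1 */
    stack[2] = r1;    steps++;   /* A.3: *(u64 *)(r10 - 16) = r1 */
    stack[1] = r1;    steps++;   /* A.4: *(u64 *)(r10 - 24) = r1 */
    stack[0] = r1;    steps++;   /* A.5: *(u64 *)(r10 - 32) = r1 */
    return steps;
}

/* Program B: the same stores done by a loop. `bound` plays the role of
 * the constant 4 at B.7 (or of a register with a known value). */
static size_t run_loop(uint64_t stack[4], uint64_t bound)
{
    /* The loop exits on `r2 != bound`, so bound must be reachable and
     * must not run past the four modelled slots. */
    assert(bound >= 1 && bound <= 4);

    size_t steps = 0;
    uint64_t r1 = 0;  steps++;   /* B.1: r1 = 0   */
    uint64_t r2 = 0;  steps++;   /* B.2: r2 = 0   */
    size_t r3 = 4;    steps++;   /* B.3: r3 = r10 (one past the top slot) */
    do {
        r3 -= 1;        steps++; /* B.4: r3 += -8              */
        stack[r3] = r1; steps++; /* B.5: *(u64 *)(r3 - 0) = r1 */
        r2 += 1;        steps++; /* B.6: r2 += 1               */
        steps++;                 /* B.7: if r2 != bound goto l1 */
    } while (r2 != bound);
    return steps;
}
```

Calling `run_unrolled` on a four-element array walks 5 instructions, while `run_loop` with `bound` set to 4 walks 19 (3 setup instructions plus 4 iterations of 4 instructions) and leaves the array in the same zeroed state.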

https://github.com/llvm/llvm-project/pull/97648