[LLVMdev] Proposal for "llvm.mem.vectorize.safelen"

Thu Aug 21 12:17:15 PDT 2014

Here's an attempt to nail down the annotation semantics with support for respecting forward lexical dependences. 

Each load, store, call, or invoke instruction can be labeled with !llvm.mem.vector_loop_access, which has two operands:
* The first operand is an integer denoting lexical position.  The positions need not be consecutive, and may contain duplicates.  
* The second operand is the same as the first operand to llvm.mem.parallel_loop_access.  It's second so that it can be omitted - see mention of inlining further below.
The LoopID can have "llvm.loop.safelen" metadata.

Here is an example with three accesses with positions {10, 15, 17} and a safelen of 42.

    define void @foo(float* %a, float* %b) {
    entry:
      br label %for.body

    for.body:                                         ; preds = %for.body, %entry
      ...
      %0 = load float* %arrayidx, !llvm.mem.vector_loop_access !{i32 10, metadata !0}
      ...
      %1 = load float* %arrayidx2, !llvm.mem.vector_loop_access !{i32 15, metadata !0}
      ...
      store float %add3, float* %arrayidx5, !llvm.mem.vector_loop_access !{i32 17, metadata !0}
      ...
      br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0

    for.end:                                          ; preds = %for.body
      ret void
    }

    !0 = metadata !{metadata !0, metadata !1}
    !1 = metadata !{metadata !"llvm.loop.safelen", i32 42}
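
As a reading aid, the annotations in this example can be modelled as plain records. The Python sketch below is not LLVM API: `Access`, `guaranteed_pairs`, and the use of a string for the LoopID are names made up for illustration. It applies the rule stated just below in this message (lex(A) < lex(B) on the same loop with safelen K) and lists the ordered pairs the example makes a promise about.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Optional


@dataclass(frozen=True)
class Access:
    name: str             # hypothetical label for the instruction
    position: int         # first operand: lexical position
    loop: Optional[str]   # second operand: LoopID (may be omitted)


def guaranteed_pairs(accesses, loop, safelen):
    """Return (B, A, K) triples meaning: distance from B to A is >= K."""
    marked = [x for x in accesses if x.loop == loop]
    return [(b.name, a.name, safelen)
            for a, b in permutations(marked, 2)
            if a.position < b.position]


# The three annotated accesses from the example, with safelen 42.
example = [Access("%0", 10, "!0"),
           Access("%1", 15, "!0"),
           Access("store", 17, "!0")]
pairs = guaranteed_pairs(example, "!0", 42)
for b, a, k in pairs:
    print(f"distance from {b} to {a} is at least {k} iterations")
```

The rule is purely positional, so the pair of loads (%1, %0) is listed as well, even though two loads cannot conflict.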

Let lex(x) denote the lexical position metadata for access x.  If, for two accesses A and B:
* Both are marked with llvm.mem.vector_loop_access metadata that references the same loop L AND
* lex(A) < lex(B) AND
* L has llvm.loop.safelen metadata with value K
THEN for loop L, any dependence from B to A (B in an earlier iteration, A in a later one) has a distance of at least K iterations.
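
To illustrate why the rule only constrains the direction from B to A, here is a small Python simulation. It is a sketch under a simplifying assumption, namely that a vectorized loop runs each statement for all lanes of a chunk before moving to the next statement. All names (`run_scalar`, `run_vectorized`, `backward_body`, `forward_body`) are hypothetical and nothing here is LLVM code.

```python
def make_memory(n, d):
    # a[k] starts as k; out[] records what the forward body loads.
    return {"a": list(range(n + d)), "out": [0] * n}


def run_scalar(body, n, d):
    """Reference order: iteration by iteration, statement by statement."""
    mem = make_memory(n, d)
    for i in range(n):
        regs = {}
        for stmt in body:
            stmt(mem, regs, i)
    return mem


def run_vectorized(body, n, d, vf):
    """Chunks of vf iterations; each statement runs for every lane of the
    chunk before the next statement starts."""
    mem = make_memory(n, d)
    for start in range(0, n, vf):
        lanes = range(start, min(start + vf, n))
        regs = {i: {} for i in lanes}
        for stmt in body:
            for i in lanes:
                stmt(mem, regs[i], i)
    return mem


def backward_body(d):
    """A = load a[i] (lexically first), B = store a[i+d] (lexically second).
    Iteration i+d loads what iteration i stored: dependence from B to A."""
    def load(mem, regs, i):
        regs["x"] = mem["a"][i]

    def store(mem, regs, i):
        mem["a"][i + d] = regs["x"] + 1

    return [load, store]


def forward_body(d):
    """A = store a[i+d] (lexically first), B = load a[i] (lexically second).
    Same distance, but the dependence now runs from A to B."""
    def store(mem, regs, i):
        mem["a"][i + d] = 100 + i

    def load(mem, regs, i):
        mem["out"][i] = mem["a"][i]   # record the loaded value

    return [store, load]


N, D = 16, 4
bwd, fwd = backward_body(D), forward_body(D)
ok_narrow = run_vectorized(bwd, N, D, vf=4) == run_scalar(bwd, N, D)   # vf <= D
ok_wide = run_vectorized(bwd, N, D, vf=8) == run_scalar(bwd, N, D)     # vf > D
ok_forward = run_vectorized(fwd, N, D, vf=8) == run_scalar(fwd, N, D)
print(ok_narrow, ok_wide, ok_forward)   # prints: True False True
```

In this model the B-to-A dependence of distance 4 survives a vector width of 4 but not 8, which is the situation a safelen of 4 would describe. The A-to-B dependence survives either width, so it needs no distance guarantee.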

When llvm.mem.vector_loop_access is used on a call or invoke instruction, any memory accesses made by the callee inherit that instruction's lexical position.

Open issue: when inlining a callee with more than one memory access, the accesses will all end up with the same lexical position (that of the call site), and thus lose the dependence distance clue between them.  I don't know how often this drawback would show up in practice.  A possibility is to allow the callee instructions to be annotated with a !llvm.mem.vector_loop_access that omits the LoopID operand, i.e. just has lexical position information.  Then inlining could be more clever.
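
The loss described here, and one conceivable form of "more clever" inlining, can be sketched in Python. This is only an illustration: the renumbering scheme, the function names, and the access labels are all assumptions rather than part of the proposal, and the scheme assumes that no other caller access shares the call's position and that the callee's positions remain meaningful relative to the caller's loop.

```python
from itertools import permutations


def guaranteed_pairs(accesses):
    """accesses: list of (name, position).
    Return (B, A) pairs with lex(A) < lex(B), i.e. pairs with a guarantee."""
    return [(b, a)
            for (a, pa), (b, pb) in permutations(accesses, 2)
            if pa < pb]


def inline_naive(caller, call_name, callee):
    """Replace the call by the callee's accesses; each one inherits the
    call's position, so they all end up with the same position."""
    out = []
    for name, pos in caller:
        if name == call_name:
            out.extend((c_name, pos) for c_name, _ in callee)
        else:
            out.append((name, pos))
    return out


def inline_renumbering(caller, call_name, callee):
    """Hypothetical scheme: order accesses by (caller position, callee
    position) and hand out fresh integer positions in that order."""
    keyed = []
    for name, pos in caller:
        if name == call_name:
            keyed.extend((c_name, (pos, c_pos)) for c_name, c_pos in callee)
        else:
            keyed.append((name, (pos, 0)))
    ranks = {key: r for r, key in enumerate(sorted({k for _, k in keyed}))}
    return [(name, ranks[key]) for name, key in keyed]


# Hypothetical caller loop body and callee, as (name, position) records.
caller = [("load_a", 10), ("call_f", 15), ("store_b", 17)]
callee = [("f.load", 1), ("f.store", 2)]

naive = guaranteed_pairs(inline_naive(caller, "call_f", callee))
clever = guaranteed_pairs(inline_renumbering(caller, "call_f", callee))

print(("f.store", "f.load") in naive)    # the clue is lost
print(("f.store", "f.load") in clever)   # the clue is kept
```

Every pair guaranteed after naive inlining is still guaranteed after renumbering. The only addition is the pair inside the callee.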

- Arch Robison
  Intel Corporation