[llvm-dev] pointer provenance introduction to load/store instructions

Jeroen Dobbelaere via llvm-dev llvm-dev at lists.llvm.org
Tue Jun 15 03:07:08 PDT 2021

Hi Nuno,

thanks for the interest and feedback !

> -----Original Message-----
> From: Nuno Lopes
> Subject: RE: [llvm-dev] pointer provenance introduction to load/store instructions
> As far as I understand, your goal is to declare what's the set of objects a
> pointer may alias with at a memory dereference operation.
> For example, you may want to say that the following load can only dereference
> objects pointed-to by %p and %q:
> %v = load i8, * %r
> If %r alias with some object other than %p/%q, the load triggers UB. This
> allows you to do better AA.

Yes, this should make it possible to optimize something like:

int foo(int* a, int *b) {
  if ((uintptr_t)a +4) == (uintptr_t) b) {
    return b[0];
  } else {
    return a[1];

to something like (pseudo code, assuming 32bit pointers):
  %a.gep = getelemenptr %a, 1
  %c = cmp %a.gep, %b                                  ; This will not result in any code
  %prov = select %c, %b, %a                            ; This will also not result in any oce
  %result = load i32, i32* %a.gep, ptr_provenance i32* %prov
  ret i32 %result

Note: in the Full Restrict patches, I tried to be as specific as possible.
Later optimiziations might reveal the %c is always true or always false, which
can help with refining the ptr_provenance info. This does not prohibit the
addition of helper intrinsics that can take a more generic approach like:
  %prov = llvm.provenance.join %a, %b

> This is useful when you have the restrict keyword in a function argument and
> you inline that function. LLVM right now has no way to restrict aliasing per
> scope or operation, just per function.
> (this story has been seen by every other attribute..)
> The goal sounds useful. Though it would be nice to see some performance
> numbers as this is a complex feature and we need to understand if it's worth
> it.

In what kind of performance numbers are you interested ?
- The optional ptr_provenance argument on its own has some impact on compile time and memory usage.
See https://reviews.llvm.org/D68488#2270178 and https://reviews.llvm.org/D68488#2284762 for a discussion
on some variants and their compile time impact. 
Also see: https://llvm-compile-time-tracker.com/index.php?branch=dobbelaj-snps/perf/test_instructions_20200907-02

At this moment, we are using 'variant 1':
  variant 1 : always reserve place for the ptr_provenance; Move arguments around when adding/removing ptr_provenance

    extra space overhead for load/store instructions;
    NO impact on other instructions;
    slightly more overhead when accessing operands of load/store instructions;
    medium overhead when adding/removing ptr_provenance

- The effect of supporting full restrict on compile time ? For compilation time impact there has been a measurement in September:

of course, this does not take into account the effect on code that is using local restrict and restrict pointes in aggregates.
(which, for our customers, is extremely important). The full restrict support does trigger on functions returning an aggregate
that are inlined, as those are modelled with a noalias pointer argument.

> Nevertheless, I would not claim this feature to be related with the discussion
> around the byte type. There are only 2 solutions that avoid introducing the
> byte type: 1) disable integer optimizations that are wrong for pointers, so
> that integers can carry provenance information. This has been tried by another
> compiler, CompCert, and didn't go very well.
> Solution 2) is to lose provenance with integers and weaken AA substantially.
> While your proposal could help recovering precision a bit, it would never
> recover much, as it's a very syntactic approach. The compiler needs to find
> the memory dereference operations and associate them with the right provenance
> shrinking operations syntactically. Doesn't sound very promising, and it would
> only act as a cache, essentially. It's not that the front-end could give you
> much information. And we can prove this approach it's theoretical inferior to
> the byte type, as some escaping scenarios are not possible to recover from
> unless we disable type punning through memory (i.e., make the round-trip of
> store/load a value a NOP).
> Of course, the practicality of the by type is yet to be proven, though work is
> undergoing.

This is true. In my view, that discussion is more or less orthogonal to what the Full Restrict
patches add. For Full Restrict we do need to track the (noalias) provenance (this is needed for
the 'based-on' rule). For that a number of helpers were introduced:
- llvm.noalias : adds 'restrict/noalias' information to a pointer
- llvm.provenance.noalias : adds 'restrict/noalias' information to a pointer (ptr_provenance path)
- llvm.noalias.arg.guard : combines a computational path with a ptr_provenance path:
-- Only Load and Store have an explicit ptr_provenance argument
-- Other places where the provenance must be tracked (when storing the pointer, when passing it to a function, when returning it),
  the result of the 'llvm.noalias.arg.guard' is used, as that tracks both sides.
- llvm.noalias.copy.guard : annotated that a pointer points to a memory block containing restrict pointers.
-- This allows SROA to identify that a restrict pointer is copied when splitting up load/store of aggregates or
   replacing memcpy.

So, in the assumption that a memcpy and aggregate load/store propagates provenance, this allows us to keep track of that provenance.

Note: I am not claiming that this setup is perfect, but it has helped us providing full restrict support to our customers.

> Regardless, I'm happy to review your proposal if useful.
> Nuno
> -----Original Message-----
> From: Jeroen Dobbelaere
> Sent: 14 June 2021 23:57
> To: llvm-dev at lists.llvm.org
> Subject: [llvm-dev] pointer provenance introduction to load/store instructions
> Hi all,
> as some of you already know, the full restrict patches[0] introduce an
> optional 'ptr_provenance' operand to LoadInst/StoreInst. In the context of
> full restrict support, this operand is used to separate the computation of the
> pointer operand from the possible restrict provenance of that operand. This
> was necessary so that we could keep all the optimizations working on those
> pointers, without being blocked by the noalias intrinsics.
> At the same time we avoided the risk to lose the noalias provenance of the
> instruction.
> As that separation can also be useful in other situations, there was a request
> in the past LLVM Alias Analysis Tech calls[1], to split out the pointer
> provenance changes and propose them as a next step for inclusion.
> Fast forward to today. A lot more discussions about pointer provenance and
> related problems are ongoing.
> Maybe the provided set of patches[2] and the experience with tracking the
> provenance in the full restrict patches can be a first step to help solving
> some of those problems ?
> Any feedback is welcome, also during tomorrows AA TechCall[1]
> Greetings,
> Jeroen Dobbelaere
> [0]
> https://urldefense.com/v3/__https://reviews.llvm.org/D68484__;!!A4F2R9G_pg!OgS
> baFpyQKPfdb1fZZPBAMeUsO6AY8xnF4tVYgKRkfFCh_QRsxhFIFZDN0G_x6EX3xIg2FxX$  [PATCH
> 01/27] [noalias] LangRef: noalias intrinsics and ptr_provenance documentation.
> [1] https://urldefense.com/v3/__https://docs.google.com/document/d/17U-
> WvX8qyKc3S36YUKr3xfF-
> GHunWyYowXbxEdpHscw/edit?usp=sharing__;!!A4F2R9G_pg!OgSbaFpyQKPfdb1fZZPBAMeUsO
> 6AY8xnF4tVYgKRkfFCh_QRsxhFIFZDN0G_x6EX38QGhOaL$   LLVM AA TechCall
> [2]
> https://urldefense.com/v3/__https://reviews.llvm.org/D104268__;!!A4F2R9G_pg!Og
> SbaFpyQKPfdb1fZZPBAMeUsO6AY8xnF4tVYgKRkfFCh_QRsxhFIFZDN0G_x6EX35POcUpq$
> [ptr_provenance] Introduce optional ptr_provenance operand to load/store

More information about the llvm-dev mailing list