[llvm-dev] pointer provenance introduction to load/store instructions

Tue Jun 15 05:18:00 PDT 2021

Hi Nuno,

> >> As far as I understand, your goal is to declare what's the set of
> >> objects a pointer may alias with at a memory dereference operation.
> >> For example, you may want to say that the following load can only
> >> dereference objects pointed-to by %p and %q:
> >> %v = load i8, i8* %r
> >>
> >> If %r aliases with some object other than %p/%q, the load triggers UB.
> >> This allows you to do better AA.
> >
> > Yes, this should make it possible to optimize something like:
> >
> > int foo(int* a, int *b) {
> >   if (((uintptr_t)a + 4) == (uintptr_t)b) {
> >     return b[0];
> >   } else {
> >     return a[1];
> >   }
> > }
> >
> > to something like (pseudo code, assuming 32bit pointers):
> >   %a.gep = getelementptr %a, 1
> >   %c = cmp %a.gep, %b           ; This will not result in any code
> >   %prov = select %c, %b, %a     ; This will also not result in any code
> >   %result = load i32, i32* %a.gep, ptr_provenance i32* %prov
> >   ret i32 %result
> 
> This approach is the reverse of what I was thinking. Instead of restricting
> provenance, you are adding provenance. This is a more dangerous approach, as

I am not sure I understand this. This optimized load instruction has a provenance
that is related to %a and %b. Otherwise, this optimization would not be valid.

For: (note, also true for the variant with explicit control flow)
  %c = cmp %a.gep, %b
  %v1 = load i32, i32* %a.gep
  %v2 = load i32, i32* %b
  %result = select %c, %v2, %v1

An optimization pass that is aware of the provenance could generate the code
with merged loads while still tracking the correct provenance.
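
A minimal C sketch of why the two loads can be merged at the address level.
The function names are hypothetical, and `a + 1` stands in for the `+4` of the
32-bit example. Both functions assume `a[1]` and `b[0]` are readable. C cannot
express the provenance itself, so the sketch only checks that the loaded values
agree; carrying the "%a or %b" provenance is the job of the ptr_provenance
operand in the IR.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the IR above: both loads are performed, then one is selected. */
int select_of_loads(int *a, int *b) {
    assert(a && b);
    int c = ((uintptr_t)(a + 1) == (uintptr_t)b);  /* %c      */
    int v1 = a[1];                                 /* %v1     */
    int v2 = b[0];                                 /* %v2     */
    return c ? v2 : v1;                            /* %result */
}

/* Merged form: when %c is true, b and a + 1 are the same address, so a
 * single load from a + 1 yields the same value in both cases. In the IR,
 * %b would still feed the ptr_provenance operand through the select. */
int merged_load(int *a, int *b) {
    assert(a && b);
    (void)b;  /* only needed for provenance, which C cannot express */
    return a[1];
}
```

The merged form is only valid in the IR if the load is allowed to access
through either %a or %b, which is what the select on the provenance expresses.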

In the proposal, %prov == null indicates that the load can have any provenance,
so that can be considered the safe fallback case.

> then provenance information can never be deleted, as it's required for
> correctness. The other way around uses provenance information to aid
> optimization, but it's not required for correctness, thus can be dropped.

Hmm. I think I am starting to see the difference.

The first question is what an ordinary load means:
  %L1 = load i32, i32* %p
  %L2 = load i32, i32* %p, ptr_provenance i32* %p
  %L3 = load i32, i32* %p, ptr_provenance i32* null

In the current proposal, %L1 and %L2 are equivalent: no explicit ptr_provenance
means that the provenance of the pointer operand is used. %L3 is the way to indicate
that the load's provenance is broader than that of the pointer operand.

You probably had in mind that %L1 is equivalent to %L3, and %L2 needs to be
written explicitly to add a known provenance ?
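
The two readings can be contrasted with a small C model. This is only a sketch:
the enum, struct, macro, and function names are hypothetical and not part of
LLVM. It encodes the rule that an absent operand falls back to the pointer
operand in the current proposal, and to "any provenance" in the alternative
reading.

```c
#include <assert.h>
#include <stddef.h>

/* How the (hypothetical) ptr_provenance operand appears on a load. */
enum prov_kind {
    PROV_ABSENT,    /* %L1: no ptr_provenance operand written */
    PROV_EXPLICIT,  /* %L2: ptr_provenance names a pointer    */
    PROV_NULL       /* %L3: ptr_provenance is null            */
};

struct load_model {
    const void *ptr;      /* pointer operand                      */
    enum prov_kind kind;  /* form of the ptr_provenance operand   */
    const void *prov;     /* used only when kind == PROV_EXPLICIT */
};

/* Sentinel standing for "any provenance"; its address is the marker. */
const char any_provenance_marker = 0;
#define ANY_PROVENANCE ((const void *)&any_provenance_marker)

/* absent_means_any == 0: current proposal (absent == pointer operand).
 * absent_means_any != 0: alternative reading (absent == any provenance). */
const void *effective_provenance(const struct load_model *ld,
                                 int absent_means_any) {
    assert(ld != NULL);
    switch (ld->kind) {
    case PROV_EXPLICIT:
        return ld->prov;
    case PROV_NULL:
        return ANY_PROVENANCE;
    case PROV_ABSENT:
    default:
        return absent_means_any ? ANY_PROVENANCE : ld->ptr;
    }
}
```

With absent_means_any == 0, %L1 and %L2 agree; with it set, %L1 and %L3 agree.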

As far as I am concerned, both interpretations are valid. I chose the first as the
most convenient (and the least work): we are already doing provenance investigations
today on the pointer operand, so there is no change of behavior there, and the front
ends do not have to be taught about the provenance. (Keep in mind that my initial
focus was the full restrict implementation.)

> 
> So the main caveat of the proposal is that every single optimization touching
> memory operations needs to learn how to preserve & handle this new provenance
> information. Maybe all the changes will be down just to AA & a few utility
> functions, but still, every creation, copy, etc of memory operations needs to
> be audited.
> In general, it's good practice to add new features to the IR such that they
> can be ignored by existing code that doesn't know about them.

These rules of thumb were the driver for the approach taken with Full Restrict. A
clone would not automatically track the noalias provenance, but would drop it in a
safe and compatible way. For those optimizations where it made sense, knowledge
about the provenance and noalias metadata was added. The main thing that can be
problematic is code that iterates over 'uses' and assumes a load/store can only
be seen once.

> 
> 
> >> This is useful when you have the restrict keyword in a function
> >> argument and you inline that function. LLVM right now has no way to
> >> restrict aliasing per scope or operation, just per function.
> >> (this story has been seen by every other attribute..)
> >>
> >> The goal sounds useful. Though it would be nice to see some
> >> performance numbers as this is a complex feature and we need to
> >> understand if it's worth it.
> >
> > In what kind of performance numbers are you interested ?
> 
> I think the first question is around benefits: Are there benchmarks we care
> about that benefit from this patch? Are there regressions? Even though the
> extra code is not materialized in assembly, it still exists and may interact
> with the inliner heuristics, for example.

For now, just having this infrastructure around has a small cost: some extra
runtime memory for load/store instructions. Once the infrastructure is available,
it opens up a number of new things:
- Once filled in, tracking of provenance can be done more accurately and also faster
  (faster, as most of the computations can be skipped).
- IMHO, it's a mandatory step for adding full restrict support.
- During the last AA Tech Call, other developers already seemed to have some ideas
  on how to make use of it.

It would be nice if you can make it to the AA Tech Call today ;)

Thanks for the feedback !

Jeroen

> 
> 
> > This is true. In my view, that discussion is more or less orthogonal to what
> > the Full Restrict patches add. For Full Restrict we do need to track the
> > (noalias) provenance (this is needed for the 'based-on' rule). For that a
> > number of helpers were introduced:
> > - llvm.noalias : adds 'restrict/noalias' information to a pointer
> > - llvm.provenance.noalias : adds 'restrict/noalias' information to a pointer
> >   (ptr_provenance path)
> > - llvm.noalias.arg.guard : combines a computational path with a
> >   ptr_provenance path:
> > -- Only Load and Store have an explicit ptr_provenance argument
> > -- Other places where the provenance must be tracked (when storing the
> >    pointer, when passing it to a function, when returning it),
> >    the result of the 'llvm.noalias.arg.guard' is used, as that tracks both
> >    sides.
> > - llvm.noalias.copy.guard : annotates that a pointer points to a memory
> >   block containing restrict pointers.
> > -- This allows SROA to identify that a restrict pointer is copied when
> >    splitting up load/store of aggregates or replacing memcpy.
> >
> > So, under the assumption that a memcpy and aggregate load/store propagate
> > provenance, this allows us to keep track of that provenance.
> 
> Thanks for this quick summary! I need to think more about this explicit
> provenance tracking and how far can we stretch it. This stuff is not trivial
> :)
> 
> Nuno