[llvm-dev] [RFC] Introducing a byte type to LLVM

Thu Jun 24 00:47:48 PDT 2021

Hi Ralf,

My interpretation of the restrict handling (not just mine; we discussed this in our group)
is that the use of decrypt/encrypt triggers undefined behavior. In other words, we are
not forced to track the restrict dependency in this case. That is important if you want
to optimize restrict-annotated accesses against non-annotated accesses.
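
To illustrate the kind of optimization meant here, a minimal sketch in C (the
function name 'store_both' is made up for this example and is not part of any patch):

```c
/* Hypothetical example: 'a' is restrict-qualified, 'b' is not. */
int store_both(int *restrict a, int *b) {
    *a = 1;
    /* 'b' is not based on 'a', so in a program with defined behavior
       this store cannot modify the object that 'a' points to. */
    *b = 2;
    /* A compiler is therefore allowed to fold this load to the constant 1. */
    return *a;
}
```

If a pointer that went through an integer round trip had to be treated as possibly
based on 'a', the compiler would need to track that dependency before it could
fold the load.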

At the time I came up with the implementation, that was also a convenient fallback
to avoid some of the pitfalls. It made thinking about the solution 'easier'.
For our customers, getting the pointer-based use cases working also had the highest priority.

Now that we are going over the different pieces of the implementation and see how we can use
them in a broader context, the situation is different: instead of just tracking
the 'restrict/noalias' provenance, we now want to use that part of the infrastructure to
track provenance in general. Because of that, it also makes sense to reconsider what 'policy'
we want to use. In that context, mapping an 'int2ptr' to an 'add_provenance(int2ptr(%Decrypt), null)',
indicating that it can point to anything, makes sense, but is still orthogonal to the infrastructure.

For this particular example, it would also be nice if we could somehow indicate that the
'decrypt(encrypt(%P))' can only depend on %P. But that is another discussion.
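
For concreteness, the round trip under discussion can be sketched in C as below.
The XOR key and the names 'sketch_encrypt', 'sketch_decrypt', 'roundtrip' and
'store_via_roundtrip' are assumptions made up for this illustration; any pair of
mutually inverse functions would do.

```c
#include <stdint.h>

/* Hypothetical inverse pair: XOR with a fixed key (an assumption for
   this sketch, standing in for the 'encrypt'/'decrypt' of the thread). */
#define SKETCH_KEY ((uintptr_t)0x5a5a5a5au)

uintptr_t sketch_encrypt(uintptr_t v) { return v ^ SKETCH_KEY; }
uintptr_t sketch_decrypt(uintptr_t v) { return v ^ SKETCH_KEY; }

/* Round trip without restrict: the integer value is restored exactly,
   so the result compares equal to 'in'. */
int *roundtrip(int *in) {
    return (int *)(void *)sketch_decrypt(sketch_encrypt((uintptr_t)(void *)in));
}

/* The debated case: 'in' is restrict-qualified and the object is
   accessed through 'out'. Whether 'out' counts as 'based on' 'in',
   and hence whether the store below is defined, is exactly the open
   question; this function only shows the shape of the code. */
int store_via_roundtrip(int *restrict in, int value) {
    int *out = (int *)(void *)sketch_decrypt(sketch_encrypt((uintptr_t)(void *)in));
    *out = value;
    return *in;
}
```

In 'roundtrip', the value of the result depends only on 'in', which is the
property that would be nice to be able to express.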

Greetings,

Jeroen

> -----Original Message-----
> From: Ralf Jung <jung at mpi-sws.org>
> Sent: Thursday, June 24, 2021 09:02
> To: Jeroen Dobbelaere <dobbel at synopsys.com>; Juneyoung Lee
> <juneyoung.lee at sf.snu.ac.kr>; Nicolai Hähnle <nhaehnle at gmail.com>; llvm-
> dev at lists.llvm.org
> Subject: Re: [llvm-dev] [RFC] Introducing a byte type to LLVM
> 
> Hi again Jeroen,
> 
> >> However, I am a bit worried about what happens when we eventually add proper
> >> support for 'restrict'/'noalias': the only models I know for that one actually
> >> make 'ptrtoint' have side-effects on the memory state (similar to setting the
> >> 'exposed' flag in the C provenance TS). I can't (currently) demonstrate that
> >
> > For the 'c standard', it is undefined behavior to convert a restrict pointer to
> > an integer and back to a pointer type.
> >
> > (At least, that is my interpretation of n2573 6.7.3.1 para 3:
> >     Note that "based" is defined only for expressions with pointer types.
> > )
> 
> After sleeping over it, I think I want to push back against this interpretation
> a bit more strongly. Consider a program snippet like
> 
> int *out = (int*) decrypt(encrypt( (uintptr_t)in  ));
> 
> It doesn't matter what "encrypt" and "decrypt" do, as long as they are inverses
> of each other.
> "out" is definitely of pointer type. And by the dependency-based definition of
> the standard, it is the case that modifying "in" to point elsewhere would also
> make "out" point elsewhere. Thus "out" is 'based on' "in". And hence it is okay
> to use "out" to access the object "in" points to, even in the presence of
> 'restrict'.
> 
> Kind regards,
> Ralf
> 
> >
> > For the full restrict patches, we do not track restrict provenance across a
> > ptr2int, except for the 'int2ptr(ptr2int %P)' (which we do, as llvm sometimes
> > introduced these pairs; not sure if this is still valid).
> >
> > Greetings,
> >
> > Jeroen Dobbelaere
> >
> >> this is *required*, but I also don't know an alternative. So if this remains
> >> the case, and if we say "load i64" performs a ptrtoint when needed, then that
> >> would mean we could not do dead load elimination any more as that would remove
> >> the ptrtoint side-effect.
> >>
> >> There also is the somewhat conceptual concern that LLVM ought to have a type
> >> that can losslessly hold all kinds of data that exist in LLVM. Currently, that
> >> is not the case -- 'iN' cannot hold data with provenance.
> >>
> >> Kind regards,
> >> Ralf
> >
> 
> --
> Website: https://people.mpi-sws.org/~jung/