[llvm-dev] [RFC] Introducing a byte type to LLVM
Ralf Jung via llvm-dev
llvm-dev at lists.llvm.org
Wed Jun 23 12:09:10 PDT 2021
Hi Jeroen,
>> To add to what Juneyoung said:
>> I don't think that experiment has been made. From what I can see, the
>> alternative you propose leads to an internally consistent model -- one "just"
>> has to account for the fact that a "load i64" might do some transformation on
>> the data to actually obtain an integer result (namely, it might do ptrtoint).
>>
>> However, I am a bit worried about what happens when we eventually add proper
>> support for 'restrict'/'noalias': the only models I know for that one actually
>> make 'ptrtoint' have side-effects on the memory state (similar to setting the
>> 'exposed' flag in the C provenance TS). I can't (currently) demonstrate that
>
> For the 'c standard', it is undefined behavior to convert a restrict pointer to
> an integer and back to a pointer type.
>
> (At least, that is my interpretation of n2573 6.7.3.1 para 3:
> Note that "based" is defined only for expressions with pointer types.
> )
>
> For the full restrict patches, we do not track restrict provenance across a
> ptr2int, except for the 'int2ptr(ptr2int %P)' (which we do, as llvm sometimes
> introduced these pairs; not sure if this is still valid).
Interesting. I assumed that doing ptr2int, then doing whatever you want with
that value (say, AES encrypt and then decrypt it), and then turning the same
value back into a pointer, must always produce a pointer that is "at least as
usable" as the one that we started with. I would interpret the parts of the
standard that talk about integer-pointer casts that way.
(That's the problem with axiomatic standards: it is very easy to have mutually
contradicting axioms...)
FWIW, Rust's use of LLVM 'noalias' pretty much relies on this. It would be
rather disastrous for Rust if 'noalias' pointers cannot be cast to integers,
cast back (potentially in a different function), and used.
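A minimal sketch of the pattern in question, assuming rustc marks the `&mut i32` argument as `noalias` when lowering to LLVM. The names `KEY`, `hide`, `reveal_and_write` and `roundtrip_demo` are hypothetical, and the XOR merely stands in for "AES encrypt and then decrypt":

```rust
// Hypothetical names for illustration; none of them come from the thread.
const KEY: usize = 0x5a5a_5a5a;

// `r` is a unique reference, which this sketch assumes is passed as `noalias`.
fn hide(r: &mut i32) -> usize {
    let addr = r as *mut i32 as usize; // pointer-to-integer cast
    addr ^ KEY // scramble the integer (stand-in for encryption)
}

// A different function recovers the address and writes through it.
fn reveal_and_write(token: usize, value: i32) {
    let p = (token ^ KEY) as *mut i32; // integer-to-pointer cast
    // Safety (assumed by the caller): the pointee is still live and no
    // reference to it is in use while this write happens.
    unsafe { *p = value };
}

fn roundtrip_demo() -> i32 {
    let mut x = 1;
    let token = hide(&mut x);
    reveal_and_write(token, 42);
    x
}
```

Calling `roundtrip_demo()` is expected to observe the write made through the recovered pointer, which is the "at least as usable" reading described above.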
The C standard definition of 'restrict' is based on hypothetical alternative
executions of the program with different inputs. I can't even imagine any
reasonable way to interpret that unambiguously, so honestly I don't see how that
is even a starting point for a precise formal definition that one could prove
theorems about.^^
The ideas my colleagues and I discussed for this revolved more around the idea of
having more than one "provenance" for an allocation (so when a pointer is passed
to a function as 'restrict' argument, it gets a fresh "ID" into its provenance),
and then ensuring that the different provenances on one allocation are used
consistently. But then when you cast a ptr to an int you basically have to mark
that particular provenance as 'exposed' (losing all 'restrict' advantages) to
have any chance of handling the case of casting the int back to a ptr. That
seems fair to me honestly, if you cast a ptr to an int you cannot reasonably
expect alias analysis to make heads or tails of what you are doing. But then
'ptrtoint' has a side-effect and cannot be removed even if the result is unused.
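The following toy model sketches that idea; it is an illustration under simplifying assumptions, not a proposed semantics. `Ptr`, `Allocation` and all method names are hypothetical, the check that provenances are "used consistently" is left out, and `inttoptr` arbitrarily picks the most recently exposed provenance:

```rust
/// A pointer in the toy model: an address plus a provenance ID.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Ptr {
    addr: usize,
    prov: u32,
}

/// One allocation that can carry several provenance IDs.
struct Allocation {
    base: usize,
    next_id: u32,
    exposed: Vec<u32>, // provenance IDs marked as exposed so far
}

impl Allocation {
    /// Creates the allocation and its root pointer (provenance 0).
    fn new(base: usize) -> (Allocation, Ptr) {
        let alloc = Allocation { base, next_id: 1, exposed: Vec::new() };
        (alloc, Ptr { addr: base, prov: 0 })
    }

    /// Passing a pointer as a 'restrict' argument gives it a fresh ID.
    fn retag_restrict(&mut self, p: Ptr) -> Ptr {
        let id = self.next_id;
        self.next_id += 1;
        Ptr { addr: p.addr, prov: id }
    }

    /// Side effect: marks the pointer's provenance as exposed.
    fn ptrtoint(&mut self, p: Ptr) -> usize {
        if !self.exposed.contains(&p.prov) {
            self.exposed.push(p.prov);
        }
        p.addr
    }

    /// Succeeds only if some provenance of this allocation was exposed.
    fn inttoptr(&self, addr: usize) -> Option<Ptr> {
        if addr != self.base {
            return None;
        }
        self.exposed.last().map(|&prov| Ptr { addr, prov })
    }
}

/// Returns whether a later int-to-ptr cast yields a usable pointer.
fn cast_back_works(keep_unused_ptrtoint: bool) -> bool {
    let (mut alloc, root) = Allocation::new(0x1000);
    let p = alloc.retag_restrict(root);
    if keep_unused_ptrtoint {
        let _ = alloc.ptrtoint(p); // result unused, but the state changes
    }
    alloc.inttoptr(0x1000).is_some()
}
```

In this model `cast_back_works(true)` and `cast_back_works(false)` differ, so deleting the unused `ptrtoint` changes the outcome of the later cast.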
Kind regards,
Ralf
>
> Greetings,
>
> Jeroen Dobbelaere
>
>> this is *required*, but I also don't know an alternative. So if this remains
>> the case, and if we say "load i64" performs a ptrtoint when needed, then that
>> would mean we could not do dead load elimination any more as that would remove
>> the ptrtoint side-effect.
>>
>> There also is the somewhat conceptual concern that LLVM ought to have a type
>> that can losslessly hold all kinds of data that exist in LLVM. Currently, that
>> is not the case -- 'iN' cannot hold data with provenance.
>>
>> Kind regards,
>> Ralf
>
--
Website: https://people.mpi-sws.org/~jung/