[llvm-dev] [RFC] : LLVM IR should allow bitcast between address spaces with the same size

Mon Nov 29 10:15:03 PST 2021

> On Nov 29, 2021, at 08:07, Renato Golin via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> On Mon, 29 Nov 2021 at 12:50, Sankisa, Krishna (Chaitanya) via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
> This bitcast is intended to be used by transform passes when it's been decided that the resultant operation is a no-op (like in the GVN case of inserting ptrtoint/inttoptr).
> When reinterpretation of bits is not supported by target, the resultant value is poison.
> 
> This will depend on the target's own interpretation of what each address space is meant to be.

No, this change doesn’t really have to do with the address spaces at all. It’s fixing the representation of a no-op cast so the optimizer does not have to introduce ptrtoint. It isn’t intended as a semantic change for the interpretation of address spaces or address space casts.

> 
> Each optimisation that uses this will have to ask the target info classes for the similarity (for some definition of) between two address spaces and then checking if the cast is actually a no-op or not.

The point here is this optimization does not care about whether a proper address space cast is a no-op or not. The original code was not performing an address space cast, it was doing type punning and reinterpreting some bits as a different address space.
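
As a rough illustration of the kind of type punning meant here, the following C++ sketch models it outside of LLVM. Ptr and pun_through_memory are hypothetical names invented for this example, not LLVM APIs; the address space is only a template tag, and the one assumption is that both pointer types have the same size.

```cpp
#include <cstdint>
#include <cstring>

// Toy stand-in for a pointer value: the address space is a compile-time tag,
// the payload is just raw bits. This is not an LLVM type.
template <unsigned AddrSpace>
struct Ptr {
    std::uint64_t bits;
};

// Type punning through memory: write the bytes of a pointer in one address
// space into a slot, then read the same bytes back as a pointer in another
// address space. The bits are reinterpreted, never converted.
template <unsigned To, unsigned From>
Ptr<To> pun_through_memory(Ptr<From> p) {
    static_assert(sizeof(Ptr<To>) == sizeof(Ptr<From>),
                  "only same-size pointers can be reinterpreted");
    unsigned char slot[sizeof(Ptr<From>)];
    std::memcpy(slot, &p, sizeof p);   // store as address space From
    Ptr<To> q;
    std::memcpy(&q, slot, sizeof q);   // load as address space To
    return q;
}
```

For example, pun_through_memory<3>(Ptr<1>{0x1234}) yields a Ptr<3> whose bits are still 0x1234. No address space conversion is computed at any point; whether those bits mean anything in the new address space is a property of the original program.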

> 
> As Serge said, some targets use address spaces for things that are completely segregated even if they have the same data layout (ex. [1]). Bit-casting on those cases is an invalid operation.

This is true, but this isn’t directly related to this change. If the target does not have no-op conversions between a given pair of address spaces, the original code was broken to begin with. This change does not allow introducing new illegal address space conversions that were not already present in the original program. It’s garbage in, garbage out. If the target can’t really reinterpret these pointers, it would be UB on access.

> 
> As a result, the infrastructure that allows this must be extremely conservative (ie. return false on the base class) and only return true after enough checks are done by the target library, and then it's up to that target to allow/deny bit-casts between address spaces (which at that point, don't even need to have the same data layout).

This would introduce target dependence into core IR instruction semantics, which would be bad. The transformation this intends to fix would not be improved by target information. This pointer-reinterpret-to-cast transformation is and should be done unconditionally. If we were to introduce target information here, it would look something like if (isNoopCast()) { doNoOpAddrSpaceCast } else { doPtrToIntIntToPtr }. This doesn’t avoid the need to introduce ptrtoint; it just gives some targets better IR some of the time.
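
A minimal sketch of why the target query does not help, in C++ with hypothetical names: Op, lower_with_target_query, lower_with_same_size_bitcast and the targetSaysNoop parameter are made up for this example and only model which instructions would get emitted.

```cpp
#include <vector>

// Instructions the optimizer might emit when forwarding a punned pointer.
enum class Op { PtrToInt, IntToPtr, AddrSpaceCast, BitCast };

// Target-dependent variant, mirroring the if/else described above.
// targetSaysNoop is a hypothetical stand-in for a target hook's answer.
std::vector<Op> lower_with_target_query(bool targetSaysNoop) {
    if (targetSaysNoop)
        return {Op::AddrSpaceCast};
    // Targets that answer "no" still get the ptrtoint/inttoptr pair.
    return {Op::PtrToInt, Op::IntToPtr};
}

// Proposed variant: a single value-level reinterpretation, with no target
// query involved.
std::vector<Op> lower_with_same_size_bitcast() {
    return {Op::BitCast};
}
```

In this model, lower_with_target_query(false) still contains PtrToInt, while lower_with_same_size_bitcast() never does, regardless of target.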

> 
> In a nutshell, I don't think *only* looking at the data layout will be safe. You need to dig deeper and make that a target decision, and allow passes to completely ignore this specific pattern (ie. avoid searching the whole IR for it) if there's no chance it will ever be useful. This makes sure targets that don't support that don't pay the penalty of traversing the IR for no benefit, because the response will always be `false`.

To reiterate, there is no target-specific pattern to search for here. There is a general type punning pattern, and we need to introduce new IR that represents the same type punning, only using IR values instead of memory. Currently the only way to represent this is with an inttoptr/ptrtoint pair, which everyone agrees the optimizers should never introduce.

-Matt