[PATCH] D138563: [IntrisicEmitter] Allow intrinsic types to be dependent on the data layout

Fri Nov 25 07:57:28 PST 2022

arichardson added a comment.

In D138563#3950912 <https://reviews.llvm.org/D138563#3950912>, @arsenm wrote:

> I have mixed feelings about this patch. It feels useful to support the typed matchers, mostly in the context of fixing iPTR (really we should have a more generally parameterized iPTR if we can't fully get rid of it)
>
>> The alternative would be to overload all the remaining intrinsics but that
>> causes a lot of test churn (especially for llvm.va_*/llvm.stack{save,restore}*)
>> and should not be necessary since those intrinsics don't really need to
>> be overloaded, the types just happen to depend on the data layout string.
>
> I think this is the correct solution; I think llvm_ptr_ty should be removed entirely and all intrinsics should be mangled on the pointer address space. It's not composable and we've just been migrating an intrinsic at a time as needed. We've already migrated some intrinsics for the stack use case (e.g. int_frameaddress and lifetime markers) and this switches strategies for a different set so it introduces inconsistency. This is a roadblock to fully supporting per-alloca address spaces, which has come up before as potentially useful. The datalayout address spaces are really for situations where some generic code needs to create a new value without any context, which is not the case with an intrinsic call site.
>
> This is also going to make these intrinsics syntactically behave differently based on the datalayout. We tried something similar with program address spaces, such that the assembler currently interprets an undecorated function's address space differently based on the datalayout and I'm currently hating it. It's inconsistent and confusing and I'll likely post a patch to undo it, such that functions are always printed with an explicit addrspace(N) if it's non-0.
>
> Assuming we do want to go with this approach, this is missing various assembler and verifier tests for all the impacted intrinsics.

I am also happy to go down the overload all intrinsics route, I just posted this as a possible approach that avoids output changes for the common "(almost) everything in AS0" targets.
If others also believe the overloading approach is better, I'll close this and post (large) patches to overload the other intrinsics that we overloaded for CHERI over the past 10 years and failed to upstream. I opened a test PR for CHERI LLVM to see how much this reduces the diff (https://github.com/CTSRD-CHERI/llvm-project/pull/663) and it shows that around 200 test files will have to be updated for overloaded llvm.va*+llvm.stacksave/stackrestore.

Regarding the address space of functions that is probably my fault, but I thought it was always printed? I believe the only thing that I allowed was to allow IR that has calls without an explicit address space to not error if the target global is a function in the program address space - but if you feed it into `llvm-as | llvm-dis` it should be printed again?
That being said it might also be that global function declarations are implicitly being parsed as being in the program address space, but I'd have to double-check. This is rather useful for some of the downstream CHERI tests, but it can also be fixed using `sed` for the handful of tests that are affected. IMO a nicer solution for this would be to allow symbolic constants in the textual IR such as `addrspace(P)`, addrspace(G).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138563/new/

https://reviews.llvm.org/D138563