[PATCH] D26157: [OpenCL] always use SPIR address spaces for kernel_arg_addr_space MD

Thu Nov 3 08:29:31 PDT 2016

Anastasia added a comment.

> Is there nowadays such a thing as "standard OpenCL logical AS IDs" which could be retained down to code gen? I must say I haven't checked the current situation here. It used to be that the logical ids are assumed to be converted to the target ones already in Clang and I'm afraid changing this requires a major rework.

I think it's a really hard topic, and people probably have different interpretations. My understanding is that the Clang AST contains source-level ASes (i.e. OpenCL and CUDA ones), which are translated down to logical ASes in IR (different ASes in CUDA and OpenCL could map to the same AS in IR). I don't think those ASes in IR are fixed or agreed to have any defined semantics either. This is probably the source of major confusion. Clang translates to target (i.e. IR) address spaces, which should be lowered by LLVM as late as possible into physical ones, once the address space information is no longer needed.

But I think there are a lot of different implementations and shortcuts at the moment that translate either early or late. It's hard to say what's best. I guess any approach is fine as long as it works and is generic enough for everyone's needs. But in the longer term, a better understanding and common concepts should be found, in my opinion.

I must say that for this particular change it will probably be confusing to see different AS numbers in IR operations and in MD when a target's mapping differs from SPIR's. Perhaps we could discuss alternative strategies.

> pocl has used the fake-address-space ids until now for exactly this -- retaining the logical AS info in the IR which can be exploited for disjoint address space AA, but is also required for handling the different AS kernel arguments as local arguments must be treated differently memory allocation wise.
> 
> However, as all backends are not expected to support mapping the fake address spaces to their internal ones (in single AS it's trivial to simply ignore the AS ids, but for multi-AS machines there has to be explicit mapping) we have had an IR pass that converts the address spaces to the target's before code gen. This pass we call TargetAddressSpaces has grown way too complex and is a major source of bugs all the time.
> 
> Also, another source of bugs is the fact that many passes simply ignore address spaces as they have been developed for single AS machines and only tested on them. This leads to bugs where the AS ID info is silently dropped (converted to 0) which makes them hard to catch. If the pointer creation APIs of LLVM were forced to include the AS ID in the construction, it might weed out the majority of these issues -- as long as the coders respect the fact that there can be multiple ASs and not simply use 0 there all the time.
> 
> Also, some optimizations such as vectorization might get confused in case it sees non-0 address spaces for CPU targets (e.g. there might not be vectorized intrinsics available for non-0 ASs).

I think it's the same for the frontend actually. Clang wasn't written with AS in mind, and although most problems have been fixed over time, many commits still revolve around various AS issues.

Is this something that should be addressed by the LLVM community? Something whose improvement we could discuss on the llvm-dev list? I am guessing a lot of contributors who work with segmented memory architectures could benefit from this work.

> Etc. Thus, due to the limited time our group has available for hunting the bugs that stem from this, I decided it might be best to avoid the use of the "logical IDs" inside IR for now and think about how to implement the disjoint AA without them later on.

Repository:
  rL LLVM

https://reviews.llvm.org/D26157