[PATCH] D112466: [NVPTX] Drop memory references of LDG/LDU

Wed Nov 3 07:24:14 PDT 2021

asavonic added a comment.

In D112466#3103803 <https://reviews.llvm.org/D112466#3103803>, @tra wrote:

> While `ldu` does indeed specify that it loads from read-only memory, I do not think we can treat `ld.global.nc` the same way.
> PTX spec says `Load register variable d from the location specified by the source address operand a in the global state space, and optionally cache in non-coherent texture cache. Since the cache is non-coherent, the data should be read-only within the kernel's process.`
>
> The way I read it -- it's a regular load that bypasses cache.

I'm not sure, but I think the spec says that the load is cached:
`Load [...] from the location [...] in the global state space, and optionally cache in non-coherent texture cache.`
The problem is that the cache is non-coherent and optional.

> Unlike `ldu`, it does not specify that it's a read-only data. While read-only data will make it work correctly, it's not the only valid use case.
> E.g. data may be changed from the host and two subsequent `ld.global.nc` will return different values.

You mean something like this, right?

  ld.global.nc r1, [a]
  st.global [a], r2           ; may be updated from host?
  ld.global.nc r3, [a]

I think the result is undefined. If the first load cached the value, the second load can hit the stale cache entry, so r3 equals r1. If it did not (the cache is optional), the second load reads memory, so r3 equals r2.
However, what happens for this code?

  st.global [a], r4
  ld.global.nc r5, [a]

Provided that `a` is not cached before the store, `ld.global.nc` should always load the stored value, right?
If this is correct, then `ld.global.nc` can behave like a "load" in some cases, and therefore should have the `mayLoad` flag.
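
To make the two cases above concrete, here is a rough toy model in Python. It is only a sketch of my reading of the spec, not a description of real hardware: the class `ToyGlobalMemory` and the helpers `load_store_load` and `store_then_load` are hypothetical names, and the assumption that a store never invalidates the non-coherent cache is mine.

```python
class ToyGlobalMemory:
    """Toy model: global memory plus an optional, non-coherent read cache."""

    def __init__(self, cache_enabled):
        self.mem = {}
        self.nc_cache = {}
        # The spec says caching is optional, so model it as a switch.
        self.cache_enabled = cache_enabled

    def st_global(self, addr, value):
        # Assumption: a store updates memory only and never invalidates
        # the non-coherent cache.
        self.mem[addr] = value

    def ld_global_nc(self, addr):
        if not self.cache_enabled:
            return self.mem[addr]
        if addr not in self.nc_cache:
            # Cache miss: fill the entry from memory.
            self.nc_cache[addr] = self.mem[addr]
        return self.nc_cache[addr]


def load_store_load(cache_enabled):
    """First snippet: ld.global.nc, st.global, ld.global.nc on one address."""
    m = ToyGlobalMemory(cache_enabled)
    m.mem["a"] = 1               # initial contents of [a]
    r1 = m.ld_global_nc("a")
    m.st_global("a", 2)          # r2 = 2
    r3 = m.ld_global_nc("a")
    return r1, r3


def store_then_load(cache_enabled):
    """Second snippet: st.global followed by ld.global.nc, nothing cached yet."""
    m = ToyGlobalMemory(cache_enabled)
    m.mem["a"] = 0               # initial contents of [a]
    m.st_global("a", 4)          # r4 = 4
    return m.ld_global_nc("a")   # r5
```

In this model `load_store_load(True)` returns `(1, 1)` (r3 is the stale r1), `load_store_load(False)` returns `(1, 2)` (r3 is r2), and `store_then_load` returns 4 whether or not the cache is enabled, which is the case where `ld.global.nc` acts like an ordinary load.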

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D112466