[llvm] [Intrinsics][AArch64] Add intrinsics for masking off aliasing vector lanes (PR #117007)

Mon Sep 1 05:38:03 PDT 2025

================
@@ -24188,13 +24186,19 @@ Overview:
 
 Given a vector store to %ptrA followed by a vector load from %ptrB, this
 instruction generates a mask where an active lane indicates that the
-read-after-write sequence can be performed safely for that lane, without the
-danger of it turning into a write-after-read sequence.
+read-after-write sequence can be performed safely for that lane, without a
+read-after-write hazard occurring or a new store-to-load forwarding hazard
+being introduced.
 
 A read-after-write hazard occurs when a read-after-write sequence for a given
 lane in a vector ends up being executed as a write-after-read sequence due to
 the aliasing of pointers.
 
+A store-to-load forwarding hazard occurs when a vector store writes to an
+address that partially overlaps with the address of a subsequent vector load.
+Only the overlapping addresses can be forwarded to the load if the data hasn't
+been written to memory yet.
----------------
sdesmalen-arm wrote:

The issue is that the load can't be performed until the store has completed, which results in a stall that does not occur when the same accesses execute as scalars. So perhaps you could write instead:
```suggestion
A store-to-load forwarding hazard occurs when a vector store writes to an
address that partially overlaps with the address of a subsequent vector load,
meaning that the vector load can't be performed until the vector store has
completed.
```
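
To make the "partially overlaps" condition concrete, here is a minimal C++ sketch of how the cases could be told apart. It is only an illustration and not the lowering or the intrinsic's definition: the names `Overlap`, `byteDistance`, `classifyOverlap` and `safeLeadingLanes` are hypothetical, and it assumes the store and load have the same width, with the lane count derived from the byte distance between the two pointers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical classification of a vector store followed by a vector load.
enum class Overlap { None, Full, Partial };

// Absolute distance in bytes between two addresses.
static std::uint64_t byteDistance(std::uint64_t A, std::uint64_t B) {
  return A > B ? A - B : B - A;
}

// Classify how a VecBytes-wide store at StoreAddr overlaps a VecBytes-wide
// load at LoadAddr (assumption: both accesses have the same width).
Overlap classifyOverlap(std::uint64_t StoreAddr, std::uint64_t LoadAddr,
                        std::uint64_t VecBytes) {
  assert(VecBytes > 0 && "vector width must be non-zero");
  std::uint64_t Dist = byteDistance(StoreAddr, LoadAddr);
  if (Dist >= VecBytes)
    return Overlap::None; // Disjoint byte ranges, nothing to forward.
  // Same address: the load reads exactly what the store wrote.
  // Otherwise only part of the loaded bytes come from the store, which is
  // the case where the load has to wait for the store to complete.
  return Dist == 0 ? Overlap::Full : Overlap::Partial;
}

// Number of leading lanes for which the store and the load touch disjoint
// bytes (assumption: illustrative model only, not the intrinsic's semantics).
std::uint64_t safeLeadingLanes(std::uint64_t StoreAddr, std::uint64_t LoadAddr,
                               std::uint64_t VecBytes, std::uint64_t EltBytes) {
  assert(EltBytes > 0 && VecBytes % EltBytes == 0 && "bad element size");
  if (classifyOverlap(StoreAddr, LoadAddr, VecBytes) != Overlap::Partial)
    return VecBytes / EltBytes; // All lanes are fine.
  // With N active lanes both accesses cover N * EltBytes bytes, so they stay
  // disjoint as long as N * EltBytes <= distance.
  return byteDistance(StoreAddr, LoadAddr) / EltBytes;
}
```

For example, with 16-byte vectors of 4-byte elements and pointers 8 bytes apart, this model reports a partial overlap and two leading lanes that avoid it, which is the situation a mask would need to cover.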

https://github.com/llvm/llvm-project/pull/117007