[llvm-dev] Element atomic vector stores: just do it?

Fri Aug 6 05:54:36 PDT 2021

> On Aug 6, 2021, at 6:16 AM, Max Kazantsev via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> So if we somehow enforce lowering of the vector stores into hardware-supported operations and prohibit other passes from tearing them apart, we’ll have ymm loads aligned to some basic type (say i32). It’s widely known that on X86, although xmm/ymm stores are not atomic, they don’t tear words if they are aligned to the width of the word (please correct me if that’s not true!).

As a practical matter, it is hard to imagine an implementation that wouldn’t use at least word-size-aligned accesses if it split an SSE/AVX op, and as far as I know every actual implementation has done so. However, this is not actually guaranteed by the x86 ISA, nor by Intel’s or AMD’s documentation.
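
For contrast, here is a minimal C++ sketch of the per-element guarantee a language memory model does give, as opposed to relying on how a wide store happens to be split in hardware. The names (kLanes, AtomicLanes, store_elementwise, load_elementwise) are hypothetical and chosen for illustration only; nothing here is LLVM's lowering or anyone's proposed implementation. Each 32-bit lane is written with its own relaxed atomic store, so no individual element can tear, but the eight stores together are not atomic as a group.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical names for illustration; eight 32-bit lanes span 256 bits,
// the width of one ymm register.
constexpr std::size_t kLanes = 8;
using AtomicLanes = std::array<std::atomic<std::uint32_t>, kLanes>;

// Write each lane with its own relaxed atomic store. The C++ memory model
// guarantees that no single 32-bit element tears; it does NOT make the
// eight stores atomic as a whole.
inline void store_elementwise(AtomicLanes& dst, const std::uint32_t* src) {
    assert(src != nullptr);
    for (std::size_t i = 0; i < kLanes; ++i) {
        dst[i].store(src[i], std::memory_order_relaxed);
    }
}

// Read each lane with a relaxed atomic load. A concurrent reader may see a
// mix of old and new lanes, but every lane it sees is a whole value.
inline void load_elementwise(const AtomicLanes& src, std::uint32_t* dst) {
    assert(dst != nullptr);
    for (std::size_t i = 0; i < kLanes; ++i) {
        dst[i] = src[i].load(std::memory_order_relaxed);
    }
}
```

Written this way, the no-tearing property comes from the language rather than from the observed behavior of a particular CPU. Whether a compiler may or will turn such a loop into a single vector store is exactly the open question in this thread.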

– Steve