[PATCH] D14489: [AArch64] Applying load pair optimization for volatile load/store

Tim Northover via llvm-commits llvm-commits at lists.llvm.org
Fri Nov 13 12:24:18 PST 2015


t.p.northover added a comment.

For "Normal" memory the microarchitecture has quite a bit of freedom to do this kind of optimisation itself, but that's probably fine because software can't detect it. "Device" memory is designed to turn that kind of thing off, though, and accessing it is the primary valid use for volatile.
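
To make the case under discussion concrete, here is a minimal C sketch of the kind of source that would produce two adjacent volatile loads. The struct layout and the names dev_regs and read_status_then_data are hypothetical, not taken from the patch, and the assembly in the comments is only what a compiler might plausibly emit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register block: two adjacent 64-bit device registers. */
struct dev_regs {
    volatile uint64_t status;
    volatile uint64_t data;
};

/* Each volatile read has to happen exactly once, in program order.
 * The two loads are adjacent and 8 bytes apart, so they are the shape
 * a load-pair optimisation would want to fuse into a single ldp. */
uint64_t read_status_then_data(struct dev_regs *regs, uint64_t *data_out) {
    assert(regs != 0 && data_out != 0);
    uint64_t status = regs->status; /* possibly: ldr x8, [x0]     */
    *data_out = regs->data;         /* possibly: ldr x9, [x0, #8] */
    return status;
}
```

If regs pointed at Device memory, the question in this review is whether replacing those two ldr instructions with one ldp could ever change what the device sees.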

The main point that now worries me is this note I've just discovered in B2.8.2 (Device memory):

> An instruction that generates a sequence of accesses as described in Atomicity in the ARM architecture on page B2-81 might be abandoned as a result of an exception being taken during the sequence of accesses. On return from the exception the instruction is restarted, and therefore one or more of the memory locations might be accessed multiple times. This can result in repeated accesses to a location where the program only defines a single access. For this reason, ARM strongly recommends that no accesses to Device memory are performed from a single instruction that spans the boundary of a translation granule or which in some other way could lead to some of the accesses being aborted.


I think ldp/stp operations could be aborted like that (B2-81 specifically says they generate 2 single-copy atomic accesses), so Chad's probably right that we can't do anything for volatile. Well done Chad!
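
As a rough illustration of the "spans the boundary of a translation granule" clause in the quoted note, the check below tests whether an access of a given size crosses a granule boundary. The helper name spans_granule and the 4 KiB granule in the usage note are assumptions made for this sketch. Passing the check would not make an ldp/stp to Device memory safe on its own, since the note also covers accesses that could be aborted "in some other way".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true if the byte range
 * [addr, addr + size) touches more than one granule.
 * The granule size must be a power of two. */
bool spans_granule(uint64_t addr, uint64_t size, uint64_t granule) {
    assert(size > 0 && granule > 0 && (granule & (granule - 1)) == 0);
    uint64_t first = addr & ~(granule - 1);              /* granule of first byte */
    uint64_t last = (addr + size - 1) & ~(granule - 1);  /* granule of last byte  */
    return first != last;
}
```

For example, with a 4096-byte granule, a 16-byte pair access at 0xFF8 covers 0xFF8 through 0x1007 and so crosses a boundary, while the same access at 0xFF0 ends at 0xFFF and does not.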


http://reviews.llvm.org/D14489




