[PATCH] D12785: Document __builtin_nontemporal_load and __builtin_nontemporal_store.

Richard Smith via cfe-commits cfe-commits at lists.llvm.org
Thu Sep 10 18:29:55 PDT 2015


On Thu, Sep 10, 2015 at 5:57 PM, Michael Zolotukhin <mzolotukhin at apple.com>
wrote:

> mzolotukhin added inline comments.
>
> ================
> Comment at: docs/LanguageExtensions.rst:1802-1807
> @@ +1801,8 @@
> +
> +For example, on AArch64 in the following code::
> +
> +  LDR X1, [X2]
> +  LDNP X3, X4, [X1]
> +
> +the ``LDNP`` might be executed before the ``LDR``. In this case the load would
> +be performed from a wrong address (see 6.3.8 in `Programmer's Guide for ARMv8-A
> ----------------
> rsmith wrote:
> > This seems to make the feature essentially useless, since you cannot
> > guarantee that the address register is set up sufficiently far before the
> > non-temporal load. Should the compiler not be required to insert the
> > necessary barrier itself in this case?
> Yes, we can require targets to use the corresponding NT instructions only when
> it's safe, and then remove this remark from the documentation. For ARM64
> that would mean either not emitting LDNP at all, or conservatively emitting
> barriers before each LDNP (which probably removes all performance benefits
> of using it) - that is, yes, non-temporal loads would be useless on this
> target.
>

I think this should already be the case -- according to the definition of
!nontemporal in the LangRef (
http://llvm.org/docs/LangRef.html#load-instruction), using an LDNP without
an accompanying barrier would not be correct on AArch64, as a bare LDNP does
not have the semantics the LangRef requires of the load.


> But I think we want to keep the builtin for NT-load, as it's a generic
> feature, not ARM64 specific. It can be used on other targets - e.g. we can
> use this in x86 stream builtins, and hopefully simplify their current
> implementation. I don't know about non-temporal operations on other
> targets, but if there are others, they can use it too right out of the box.


Yes, I'm not arguing for removing the builtin, just that the AArch64
backend needs to be very careful when mapping it to LDNP, because a direct
mapping will frequently not be correct.
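
For illustration, here is a minimal sketch of how the two builtins under
discussion might be used from C. It assumes the forms given in the Clang
documentation, __builtin_nontemporal_load(ptr) and
__builtin_nontemporal_store(value, ptr). The names NT_LOAD, NT_STORE and
copy_streaming are hypothetical and exist only for this example. The
fallback to plain accesses is likewise an assumption, added so the code still
builds on compilers without the builtins. Whether a target actually emits a
non-temporal instruction is up to the backend, as discussed above.

```c
#include <assert.h>
#include <stddef.h>

/* Compilers without __has_builtin are treated as lacking the builtins. */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* Hypothetical wrappers: use the Clang builtins when available,
 * otherwise fall back to ordinary loads and stores. */
#if __has_builtin(__builtin_nontemporal_load) && \
    __has_builtin(__builtin_nontemporal_store)
#define NT_LOAD(p)     __builtin_nontemporal_load(p)
#define NT_STORE(v, p) __builtin_nontemporal_store((v), (p))
#else
#define NT_LOAD(p)     (*(p))
#define NT_STORE(v, p) (*(p) = (v))
#endif

/* Hypothetical helper: copy n ints while hinting that neither the
 * source nor the destination is expected to be reused soon. */
void copy_streaming(int *dst, const int *src, size_t n) {
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; ++i) {
        int v = NT_LOAD(&src[i]);   /* non-temporal hint on the load  */
        NT_STORE(v, &dst[i]);       /* non-temporal hint on the store */
    }
}
```

The hint only affects caching behaviour; the values read and written must be
the same as with ordinary accesses, which is why a lowering that can read
from the wrong address would not be acceptable.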