[PATCH] D12221: [RFC] Introduce `__attribute__((nontemporal))`.

Michael Zolotukhin via cfe-commits cfe-commits at lists.llvm.org
Fri Aug 21 10:35:03 PDT 2015


mzolotukhin added a comment.

Hi all,

Thanks for the feedback. Please find my answers below:

> What does it mean to have the attribute applied to non-pointer types like int __attribute__((nontemporal)) i; ? The ACLE doesn't say but making it erroneous might make sense. Perhaps it would be good to have a semantic test which uses __attribute__((nontemporal)).


**David**,
That's a good idea. Actually, I don't know how we should behave in such cases, but just giving an error should probably be fine. Should we handle references in a similar manner (`int __attribute__((nontemporal)) &i`)? I'll update the patch accordingly if we decide to go with type attributes.

> This seems like a property of an operation, rather than a property of a type. Have you considered adding a __builtin_nontemporal_store builtin as an alternative?


**Richard**,
Yes, I've considered a builtin as an alternative. In fact, I started with it as it was easier to implement, but then decided to switch to a type attribute for the following reasons:

1. ARM ACLE 2.0 mentions the attribute. Though it's not a final version of the document, AFAIU, I still preferred to use it as an argument for a type attribute.
2. Once we introduce a builtin, we'll have to support it forever (otherwise we could break someone's code). With the attribute the burden is much smaller, as we can just start ignoring it at any point if we need to - all the code will remain correct and compilable.
3. We'll need to have an intrinsic for every type + separate intrinsics for loads and stores. If we use the type attribute, one fits all.
4. While it's true that this is more a property of an operation than of a type, I think that in real use cases a user would rarely need to apply it to a single operation. Nontemporal operations are usually used for processing bulk volumes of data, and this data is almost always processed as a whole. That's why I think it's fine to mark the entire 'data' as nontemporal. And if a user then wants to work with a small subset of it, she can use a usual (not nontemporal) pointer to it.
5. Personally, I find the code using attributes more elegant than using builtins. Compare:

  void foo(float *__attribute__((nontemporal)) dst,
           float *__attribute__((nontemporal)) src1,
           float *__attribute__((nontemporal)) src2) {
    *dst = *src1 + *src2;
  }

and

  void foo(float *dst, float *src1, float *src2) {
    float s1 = __builtin_nontemporal_load(src1);
    float s2 = __builtin_nontemporal_load(src2);
    __builtin_nontemporal_store(s1 + s2, dst);
  }

But that said, in the end I'm open to other alternatives (including builtins), and this thread is just an attempt to find the best option.
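
For the bulk-data case from point 4, the builtin variant could be sketched roughly as below. This is only an illustration: it assumes a compiler that provides `__builtin_nontemporal_load` and `__builtin_nontemporal_store` with the argument order used above, and falls back to ordinary loads and stores otherwise. The `NT_LOAD`/`NT_STORE` macros and the `add_streams` function are hypothetical names introduced just for this sketch.

```c
#include <stddef.h>

/* __has_builtin is not available everywhere; treat it as "no" if missing. */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* Hypothetical wrappers: use the builtins when the compiler offers them,
   otherwise degrade to plain memory accesses. */
#if __has_builtin(__builtin_nontemporal_load) && \
    __has_builtin(__builtin_nontemporal_store)
#define NT_LOAD(p)     __builtin_nontemporal_load(p)
#define NT_STORE(v, p) __builtin_nontemporal_store((v), (p))
#else
#define NT_LOAD(p)     (*(p))
#define NT_STORE(v, p) (*(p) = (v))
#endif

/* Element-wise sum over whole arrays; every access goes through the
   wrappers, so the nontemporal hint is spelled out per operation. */
void add_streams(float *dst, float *src1, float *src2, size_t n) {
  for (size_t i = 0; i < n; ++i) {
    float s1 = NT_LOAD(&src1[i]);
    float s2 = NT_LOAD(&src2[i]);
    NT_STORE(s1 + s2, &dst[i]);
  }
}
```

Every access in the loop has to be wrapped explicitly, which is the verbosity point 5 refers to.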

> This doesn't seem like a fundamental property of a type, to me. If I understand properly, this has more to do with specific instances of memory access. By making it part of the type, you run into sticky situations that become hard to resolve, such as with templates in C++.


**Aaron**,
As far as I understand, type attributes don't result in such complications (as opposed to type qualifiers, e.g. `__restrict__`). That is, the attribute doesn't change the canonical type, it only adds some 'sugar' to it. I.e. `float *__attribute__((nontemporal))` and `float *` would behave as the same type in templates and name mangling. Please correct me if I'm wrong here.
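
A rough sketch of how user code might look under that assumption follows. The `NONTEMPORAL` macro, `scale_all`, and `first_element` are hypothetical names for illustration only. The macro expands to the proposed attribute when the compiler reports support for it and to nothing otherwise, so on a compiler without the attribute this shows only the intended usage, not the actual type behavior.

```c
#include <stddef.h>

/* __has_attribute is not available everywhere; treat it as "no" if missing. */
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

/* Hypothetical wrapper around the proposed attribute. If the compiler
   does not know it, the macro expands to nothing and the code still builds. */
#if __has_attribute(nontemporal)
#define NONTEMPORAL __attribute__((nontemporal))
#else
#define NONTEMPORAL
#endif

/* Bulk pass: the pointers themselves carry the nontemporal hint. */
void scale_all(float *NONTEMPORAL dst, float *NONTEMPORAL src,
               size_t n, float k) {
  for (size_t i = 0; i < n; ++i)
    dst[i] = src[i] * k;
}

/* Working with a small subset through a usual pointer: if the attribute
   is only sugar on the type, this assignment needs no cast. */
float first_element(float *NONTEMPORAL data) {
  float *plain = data;
  return plain[0];
}
```

The same fallback also illustrates point 2 above: dropping the attribute leaves the code correct and compilable.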

Thanks,
Michael


http://reviews.llvm.org/D12221




