[PATCH] D21723: [RFC] Enhance synchscope representation

Thu Sep 15 11:49:39 PDT 2016

jlebar added a comment.

WRT the langref, Mehdi and I had a conversation on IRC that I want to summarize here.

AIUI we're trying to accomplish two things here:

1. Give the front-end a way to cause LLVM to generate atomic machine instructions with the "right" scope.  We can't safely drop the scope.

2. Let LLVM optimize these instructions, safely.

If all we cared about were (1), we could just use target-specific intrinsics.  Moreover, if all we cared about were target-specific optimizations on these instructions, target-specific intrinsics would again suffice.

So being able to optimize sequences of these instructions at the IR level is important to this proposal.  Otherwise, why bother?  There would be no advantage over using target-specific intrinsics.

Thus, I want to dig into how optimizations would work under this proposal, to figure out whether it's sufficient for the things we want to do today and whether it paints us into a corner for the things we think we'll want to do in the future.

There are two classes of target-agnostic IR optimizations we might want to perform:

- Target-agnostic IR optimizations that are agnostic as to the meaning of the target's syncscopes.
- Target-agnostic IR optimizations that are not agnostic to the meaning of the syncscopes.

Without extra information provided by the target, we can only perform syncscope-meaning-agnostic optimizations from target-agnostic IR passes.  This means that, without extra information provided by the target, we can only optimize a set of atomics when all of the atomic operations in the block of code we're considering have the same syncscope.

In other words, without extra information provided by the target about the meaning of the syncscopes, a given block of code with non-syncscope'd atomics will never have more optimization opportunities after adding syncscopes.  Without extra information provided by the target, syncscopes can only limit target-agnostic IR optimization opportunities.
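
To make the "same syncscope" restriction concrete, here is a minimal sketch of the check a scope-agnostic pass would be limited to.  `AtomicOp`, `commonSyncScope`, and `canOptimizeScopeAgnostically` are hypothetical names for illustration, not existing LLVM classes or APIs, and the scope is modeled as an opaque number as in the current proposal.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for an atomic IR instruction; not LLVM's real class.
struct AtomicOp {
  uint32_t SyncScope; // opaque, target-defined scope number
};

// Returns the scope shared by every op in the block, or nullopt if the block
// is empty or mixes scopes.
std::optional<uint32_t> commonSyncScope(const std::vector<AtomicOp> &Block) {
  if (Block.empty())
    return std::nullopt;
  const uint32_t First = Block.front().SyncScope;
  for (const AtomicOp &Op : Block)
    if (Op.SyncScope != First)
      return std::nullopt;
  return First;
}

// Without knowing what the numbers mean, equality is the only comparison a
// target-agnostic pass can rely on, so mixed scopes block the optimization.
bool canOptimizeScopeAgnostically(const std::vector<AtomicOp> &Block) {
  return commonSyncScope(Block).has_value();
}
```

Under this model, a block whose ops all carry scope 1 passes the check, while a block mixing scopes 1 and 2 is rejected even if the target would consider the combination safe.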

This leads me to my question:

- Will we want, within the foreseeable future, to have target-agnostic IR passes that will make use of the meaning of the syncscopes?

  I sort of think maybe yes; it seems like a natural extension of this work.  But maybe I'm wrong.

Then the next question:

- Assuming we will want to build target-agnostic IR passes that make use of the meaning of the syncscopes, is this the right design?

  Specific questions related to this:
  - Is specifying the syncscope as a number (as opposed to, I guess, a string, or referencing a metadata node) the right way to go?  On the one hand, using numbers matches what we do for address spaces, which also have target-specific meanings.  On the other hand, does anyone know what NVPTX address space 4 means off the top of their head?
  - What are some examples of optimizations that you might do if you understood what the syncscopes meant?  What information would we need the target to give to our target-independent IR pass in order for it to perform these optimizations?
  - Should the information above be encoded into the module itself, or should it be a property of the target?
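
As one possible shape for the target-provided information asked about above, here is a sketch in which the target tells a target-independent pass which scopes are contained in which others.  Everything here (`ScopeInfo`, `addInclusion`, `covers`) is a hypothetical illustration, not part of this patch or of LLVM, and the idea that scopes nest is itself an assumption that may not hold for every target.

```cpp
#include <cstdint>
#include <set>
#include <utility>

// Hypothetical target-provided table.  An entry (Inner, Outer) means every
// thread that synchronizes at scope Inner also synchronizes at scope Outer.
class ScopeInfo {
  std::set<std::pair<uint32_t, uint32_t>> Inclusions;

public:
  void addInclusion(uint32_t Inner, uint32_t Outer) {
    Inclusions.emplace(Inner, Outer);
  }

  // True when Outer is known to cover Inner.  A scope always covers itself;
  // pairs the target has not listed are conservatively treated as unrelated.
  // No transitive closure is computed, so the target must list every pair.
  bool covers(uint32_t Outer, uint32_t Inner) const {
    return Outer == Inner ||
           Inclusions.count(std::make_pair(Inner, Outer)) != 0;
  }
};
```

A pass could then reason about two atomics with different scopes when one `covers` the other, instead of giving up whenever the numbers differ.  Whether such a table lives in the module or in the target is exactly the open question in the last bullet.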

================
Comment at: docs/LangRef.rst:2174
@@ +2173,3 @@
+interacts with atomic operations marked ``singlethread``,
+marked ``syncscope(<n>)`` with a different value of ``<n>``,
+or not marked ``singlethread`` or ``syncscope(<n>)``.
----------------
How about "marked ``syncscope(<m>)`` with ``m != n``" or something like that?

https://reviews.llvm.org/D21723