[PATCH] D18200: Add __atomic_* lowering to AtomicExpandPass.

James Y Knight via llvm-commits llvm-commits at lists.llvm.org
Fri Mar 18 11:46:20 PDT 2016


jyknight added inline comments.

================
Comment at: lib/CodeGen/AtomicExpandPass.cpp:181
@@ +180,3 @@
+
+    // If the Size/Alignment is not supported, replace with a libcall.
+    if (LI) {
----------------
reames wrote:
> The current structure of this code will break floating point and pointer atomic canonicalization.  The previous code fell through, the new code does not.  I suspect you need to swap the order of the two steps.  
I don't believe so. This code handles those types via appropriate calls to Builder.CreateBitOrPointerCast.

Also see the test cases starting with test_load_double. I'm missing a test for pointers there, but it does handle them too.
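Roughly, the pattern is (just a sketch of the idea, not the exact code in the patch; Ctx, Size, LibcallFn and Args stand in for values computed elsewhere):

  // The sized libcall returns an iN value; cast it back to the original
  // FP/pointer type so those atomics keep working.
  IRBuilder<> Builder(LI);
  Type *OrigTy = LI->getType();                         // e.g. double, i8*
  Value *IntRes = Builder.CreateCall(LibcallFn, Args);  // __atomic_load_N returns iN
  Value *Res = Builder.CreateBitOrPointerCast(IntRes, OrigTy);
  LI->replaceAllUsesWith(Res);
  LI->eraseFromParent();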

================
Comment at: lib/CodeGen/AtomicExpandPass.cpp:970
@@ +969,3 @@
+  }
+  llvm_unreachable("Unknown atomic memory order.");
+}
----------------
reames wrote:
> This should be the default in the switch.
No, it shouldn't: written this way, the compiler will warn (-Wswitch) if someone later adds an enumerator without handling it. That's not terribly important here, but it's a good idiom.
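For illustration, the shape is (hypothetical enum and mapping, not code from the patch):

  #include "llvm/Support/ErrorHandling.h"

  enum class MemOrder { Relaxed, Acquire, Release, SeqCst };

  // Every enumerator is handled and there is no default, so -Wswitch
  // warns if a new enumerator is added without updating this switch;
  // llvm_unreachable documents that falling out of the switch is a bug.
  static unsigned toLibcallOrder(MemOrder O) {
    switch (O) {
    case MemOrder::Relaxed: return 0;
    case MemOrder::Acquire: return 2;
    case MemOrder::Release: return 3;
    case MemOrder::SeqCst:  return 5;
    }
    llvm_unreachable("Unknown atomic memory order.");
  }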

================
Comment at: lib/CodeGen/AtomicExpandPass.cpp:980
@@ +979,3 @@
+                                  const DataLayout &DL) {
+  // TODO: "LargestSize" is an approximation for "largest type that
+  // you can express in C". It seems to be the case that int128 is
----------------
reames wrote:
> Wait, what?  Why do we need this?  We can simply test whether we have a particular sized libcall.  The enable logic should be strictly contained there.
> 
> This strongly hints to me we should split the libcall part out into its own change.  This would also make it more natural to write tests for the lowering.  (Which tests are missing from the current patch and required.)
> We can simply test whether we have a particular sized libcall.

Simply test how?? What do you mean?

> This strongly hints to me we should split the libcall part out into its own change.

What? This entire patch is about generating libcalls. What do you mean, split it into its own change? Separate from what other part?

> (Which tests are missing from the current patch and required.)

Which tests would you like to see? There are some tests already; I can add more, but I don't know what you actually mean by that, either.


================
Comment at: lib/CodeGen/AtomicExpandPass.cpp:1000
@@ +999,3 @@
+
+  if (!expandAtomicOpToLibcall(I, Size, Align, I->getPointerOperand(), nullptr,
+                               nullptr, I->getOrdering(),
----------------
reames wrote:
> Use an assert.
You mean like:
  bool expanded = expandAtomicOpToLibcall(...);
  assert(expanded);

Not sure why that's better, but sure.
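(One detail, sketched here rather than taken from the patch: with asserts compiled out (NDEBUG) that variable would be unused, so the usual pattern adds a void cast:

  bool Expanded = expandAtomicOpToLibcall(...);
  (void)Expanded;  // avoid -Wunused-variable in release builds
  assert(Expanded && "libcall expansion of atomic load should not fail");
)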

================
Comment at: lib/CodeGen/AtomicExpandPass.cpp:1110
@@ +1109,3 @@
+  // CAS libcall, via a CAS loop, instead.
+  if (!Success) {
+    expandAtomicRMWToCmpXchg(I, [this](IRBuilder<> &Builder, Value *Addr,
----------------
reames wrote:
> This would be much more naturally expressed at the callsite by having this function simply return false when the expansion doesn't work, and then applying the fallback along with the second call.  
I don't know what you mean by that.

However, I do see that I can remove the 3 Builder.* calls below by simply calling createCmpXchgInstFun, which is the same thing.
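I.e., assuming createCmpXchgInstFun matches the callback type expected by expandAtomicRMWToCmpXchg, the fallback collapses to something like:

  // Fall back to a CAS loop around the __atomic_compare_exchange libcall.
  if (!Success)
    expandAtomicRMWToCmpXchg(I, createCmpXchgInstFun);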

================
Comment at: lib/CodeGen/AtomicExpandPass.cpp:1178
@@ +1177,3 @@
+  // Build up the function call. There's two kinds. First, the sized
+  // variants.  These calls are going to be one of the following (with
+  // N=1,2,4,8,16):
----------------
reames wrote:
> What I was suggesting was that you first found the generic libcall, and then given the size information, selected the sized version if one was available.  I find the mixing of sized and unsized functions in your current array structure very confusing and hard to follow.
> 
> I hadn't realized there were no generic versions for some of them.  That seems unfortunately inconsistent.
> 
> Also, I strongly disagree with your position that we shouldn't optimize manually written generic calls into specific ones.  If it truly is an optimization, we should perform it.  If nothing else, it removes a lot of complexity from the potential frontends.  
> 
If you could be more specific about what you found confusing, maybe I can try to address it.

However, regarding the "optimization" of generic libcalls into sized libcalls: no, that just makes no sense to do. It would do nothing but obfuscate things. I'll reiterate: nobody should ever be writing calls to these libcalls themselves; it even takes some effort to do so from C.

It also cannot possibly reduce any frontend complexity, because the frontend builtins that users call are //not the same// as these libcalls. The frontend recognizes the builtins and lowers them to LLVM atomic IR instructions. Today, clang sometimes also lowers the builtins to these libcalls at its layer, but the plan is to remove that once the support in LLVM itself is all in place.

For example, compare (from https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html):
  Built-in Function: bool __atomic_compare_exchange_n (type *ptr, type *expected, type desired, bool weak, int success_memorder, int failure_memorder)
  Built-in Function: bool __atomic_compare_exchange (type *ptr, type *expected, type *desired, bool weak, int success_memorder, int failure_memorder)

(Note the "n" in "__atomic_compare_exchange_n" is literal letter n, not a stand-in for a number.)


with the libcall documented below (and in https://gcc.gnu.org/wiki/Atomic/GCCMM/LIbrary):
  bool  __atomic_compare_exchange_N(iN *ptr, iN *expected, iN desired,
                                    int success_order, int failure_order)
  bool  __atomic_compare_exchange(size_t size, void *ptr, void *expected,
                                  void *desired, int success_order,
                                  int failure_order)
(with "N=1,2,4,8,16.)

They are just completely different.
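To make that concrete, here's a small example (C++ using the GCC/clang builtin; the lowering shown in the comment is target-dependent):

  #include <cstdint>

  bool cas16(std::uint16_t *p, std::uint16_t *expected, std::uint16_t desired) {
    // Frontend builtin: typed pointers, a weak/strong flag, two orders.
    return __atomic_compare_exchange_n(p, expected, desired,
                                       /*weak=*/false,
                                       __ATOMIC_SEQ_CST, __ATOMIC_RELAXED);
  }

  // Depending on the target, this becomes a native cmpxchg or a call like
  //   __atomic_compare_exchange_2(ptr, expected, desired, 5, 0)
  // -- note that the weak flag does not survive down to the libcall.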

================
Comment at: lib/CodeGen/AtomicExpandPass.cpp:1243
@@ +1242,3 @@
+    } else {
+      AllocaValue = AllocaBuilder.CreateAlloca(ValueOperand->getType());
+      AllocaValue->setAlignment(AllocaAlignment);
----------------
reames wrote:
> The difference between generic (allocas) and specific call signatures is large enough that trying to share the code is actively confusing.  Please just split this method based on the UseSizedLibcall variable or extract helper functions.
I actually had it split along that axis in a former version of the patch, but I didn't like it, because there were two copies of basically the same function, with just minor variations. I found I was flipping back and forth trying to see what the differences were. So, I merged the two, and like it better this way.

================
Comment at: test/Transforms/AtomicExpand/SPARC/libcalls.ll:48
@@ +47,3 @@
+; CHECK:  store i16 %old, i16* %2, align 2
+; CHECK:  %4 = call zeroext i1 @__atomic_compare_exchange_2(i8* %1, i8* %3, i16 %new, i32 5, i32 0)
+; CHECK:  %5 = load i16, i16* %2, align 2
----------------
davidxl wrote:
> Are these casts from i16* -> i8* necessary?
> 
> Also, should __atomic_compare_exchange_2's 4th argument be a bool value selecting the weak/strong form of compare and exchange?
The casts aren't strictly necessary, in that all pointers are equivalent. But the code needed to choose some signature for the libcall, and it was easiest to just use i8* everywhere (which, btw, is the same thing clang is doing today). When the typeless pointers work is done, that'll go away, of course.
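Concretely, the emitted cast is just something like (sketch; Builder and Ctx assumed in scope):

  // Every pointer argument of the libcall is passed as i8*, so the pass
  // bitcasts from the original pointer type before building the call.
  Value *PtrVal = Builder.CreateBitCast(I->getPointerOperand(),
                                        Type::getInt8PtrTy(Ctx));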

No, __atomic_compare_exchange_* don't have a weak/strong argument in the libcall. Only in the frontend builtin.

================
Comment at: test/Transforms/AtomicExpand/SPARC/libcalls.ll:77
@@ +76,3 @@
+; CHECK:  %1 = bitcast i128* %arg to i8*
+; CHECK:  %2 = alloca i128, align 16
+; CHECK:  %3 = bitcast i128* %2 to i8*
----------------
davidxl wrote:
> are the checks for alloca, bitcast necessary for the test?
You mean: don't check the whole function output, just some particularly representative lines? I think it's pretty important to check all of the instructions, since each one is emitted by distinct code.


http://reviews.llvm.org/D18200




