[PATCH] D71128: [NVPTX][FIX] Expand atomics we cannot handle natively in the ISA

Artem Belevich via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Dec 10 09:13:55 PST 2019


tra added inline comments.


================
Comment at: llvm/test/Transforms/AtomicExpand/NVPTX/expand-atomic-rmw-fsub.ll:15
+; CHECK-NEXT:    [[TMP4:%.*]] = bitcast float [[LOADED]] to i32
+; CHECK-NEXT:    [[TMP5:%.*]] = cmpxchg i32* [[TMP2]], i32 [[TMP4]], i32 [[TMP3]] seq_cst seq_cst
+; CHECK-NEXT:    [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP5]], 1
----------------
arsenm wrote:
> jdoerfert wrote:
> > tra wrote:
> > > I must be missing something -- I would think that we do *not* want to expand atomicrmw variants which we can lower to an existing instruction, but a lot of the tests show the opposite and expand atomics that have direct support in hardware. The patch subject seems to agree with my assumptions, but the tests appear to contradict it. Is that intentional? If so, what is it that I'm missing? 
> > It is not intentional to pesimise anything, as mentioned above. The problem is I am neither a backend nor NVPTX person and my changes do seem to have unwanted effects I cannot even categorize.
> For the purpose of this change, that this isn't optimal doesn't matter. These aren't implemented correct, but doing so is a separate change and those changes will show up in the same tests here
OK. Looks like `atomicrmw fsub` currently fails to lower on NVPTX, so expanding it is an improvement. However, expanding `atomicrmw fadd` is a substantial regression and is likely to be a showstopper. Atomic FP32 addition is a commonly used instruction in various reduction kernels so anything that prevents mapping it to `atom.add.f32` instruction will be very noticeable. 

I realize that there are many moving parts involved in getting this to work properly. If proper fix needs multiple patches, please try to commit them atomically to avoid the performance regression in between those changes.

Also, if there are dependent patches, it would be great to arrange all of them as such in phabricator, so it's easier to see the big picture.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71128/new/

https://reviews.llvm.org/D71128





More information about the llvm-commits mailing list