[PATCH] D71128: [NVPTX][FIX] Expand atomics we cannot handle natively in the ISA

Mon Dec 9 11:57:34 PST 2019

tra added inline comments.

================
Comment at: llvm/test/Transforms/AtomicExpand/NVPTX/expand-atomic-rmw-fadd.ll:16
+; CHECK-NEXT:    [[TMP4:%.*]] = bitcast float [[LOADED]] to i32
+; CHECK-NEXT:    [[TMP5:%.*]] = cmpxchg i32* [[TMP2]], i32 [[TMP4]], i32 [[TMP3]] seq_cst seq_cst
+; CHECK-NEXT:    [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP5]], 1
----------------
Don't we want to preserve `atomicrmw fadd` in this case and lower it to `atom.add.f32` ? Why do we want to expand here?

================
Comment at: llvm/test/Transforms/AtomicExpand/NVPTX/expand-atomic-rmw-fadd.ll:131
+; CHECK-NEXT:    [[TMP4:%.*]] = bitcast double [[LOADED]] to i64
+; CHECK-NEXT:    [[TMP5:%.*]] = cmpxchg i64* [[TMP2]], i64 [[TMP4]], i64 [[TMP3]] seq_cst seq_cst
+; CHECK-NEXT:    [[SUCCESS:%.*]] = extractvalue { i64, i1 } [[TMP5]], 1
----------------
Ditto here and below. We do have `atom.add.f64`

================
Comment at: llvm/test/Transforms/AtomicExpand/NVPTX/expand-atomic-rmw-fsub.ll:5
+
+define float @test_atomicrmw_fadd_f32_flat(float* %ptr, float %value) {
+; CHECK-LABEL: @test_atomicrmw_fadd_f32_flat(
----------------
Functilon name `fadd` does not seem to match the instruction `fsub`. 

================
Comment at: llvm/test/Transforms/AtomicExpand/NVPTX/expand-atomic-rmw-fsub.ll:15
+; CHECK-NEXT:    [[TMP4:%.*]] = bitcast float [[LOADED]] to i32
+; CHECK-NEXT:    [[TMP5:%.*]] = cmpxchg i32* [[TMP2]], i32 [[TMP4]], i32 [[TMP3]] seq_cst seq_cst
+; CHECK-NEXT:    [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP5]], 1
----------------
I must be missing something -- I would think that we do *not* want to expand atomicrmw variants which we can lower to an existing instruction, but a lot of the tests show the opposite and expand atomics that have direct support in hardware. The patch subject seems to agree with my assumptions, but the tests appear to contradict it. Is that intentional? If so, what is it that I'm missing? 

================
Comment at: llvm/test/Transforms/AtomicExpand/NVPTX/unaligned-atomic.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt -S -mtriple=nvptx64-unknown-unknown -atomic-expand %s | FileCheck -check-prefix=CHECK %s
+
----------------
Nit: no need for `-check-prefix` as you only using `CHECK` in the test.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71128/new/

https://reviews.llvm.org/D71128