[flang-commits] [flang] [flang][cuda] Lower simple host to device data transfer (PR #85960)

via flang-commits flang-commits at lists.llvm.org
Thu Mar 21 01:38:22 PDT 2024


================
@@ -3706,15 +3706,40 @@ class FirConverter : public Fortran::lower::AbstractConverter {
     return false;
   }
 
+  static void genCUDADataTransfer(fir::FirOpBuilder &builder,
+                                  mlir::Location loc, bool lhsIsDevice,
+                                  hlfir::Entity &lhs, bool rhsIsDevice,
+                                  hlfir::Entity &rhs) {
+    if (rhs.isBoxAddressOrValue() || lhs.isBoxAddressOrValue())
+      TODO(loc, "CUDA data transfler with descriptors");
+    if (lhsIsDevice && !rhsIsDevice) {
+      auto transferKindAttr = fir::CUDADataTransferKindAttr::get(
+          builder.getContext(), fir::CUDADataTransferKind::HostDevice);
+      // device = host
+      if (!rhs.isVariable()) {
+        auto [temp, cleanup] = hlfir::createTempFromMold(loc, builder, rhs);
+        builder.create<hlfir::AssignOp>(loc, rhs, temp, false, false);
----------------
jeanPerier wrote:

You should use `hlfir::genAssociateExpr` + `hlfir.end_associate` here instead. The association turns into a no-op if the expression needs to be evaluated in memory anyway, whereas with the `hlfir.assign` op it is not guaranteed that the expression will be evaluated in place as the assignment RHS, so this could end up creating an extra array temporary.
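
To illustrate the difference in temporaries, here is a toy model in plain C++. It is only a sketch and does not use the flang API: `ToyExpr`, `evaluateInMemory`, `tempThenAssign`, `associate`, and the `allocations` counter are all hypothetical names made up for this example. It assumes an expression either already owns a buffer (it was evaluated in memory) or can compute its elements on demand.

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <utility>
#include <vector>

// Toy stand-in for an expression value; NOT the flang/HLFIR API.
struct ToyExpr {
  std::size_t size;
  std::function<int(std::size_t)> element;  // computes element i on demand
  std::optional<std::vector<int>> buffer;   // set if already evaluated in memory
};

static int allocations = 0;  // counts array buffers created

static std::vector<int> allocate(std::size_t n) {
  ++allocations;
  return std::vector<int>(n);
}

// Models an expression that had to be evaluated into its own buffer.
static ToyExpr evaluateInMemory(std::size_t n,
                                std::function<int(std::size_t)> f) {
  std::vector<int> buf = allocate(n);
  for (std::size_t i = 0; i < n; ++i)
    buf[i] = f(i);
  return ToyExpr{n, std::move(f), std::move(buf)};
}

// Models "create a temp, then assign": a new buffer is always allocated,
// even when the expression already lives in a buffer of its own.
static std::vector<int> tempThenAssign(ToyExpr &e) {
  std::vector<int> temp = allocate(e.size);
  for (std::size_t i = 0; i < e.size; ++i)
    temp[i] = e.buffer ? (*e.buffer)[i] : e.element(i);
  return temp;
}

// Models "associate": reuse the existing buffer when there is one (no new
// allocation), and materialize storage only when there is none.
static std::vector<int> associate(ToyExpr &e) {
  if (e.buffer)
    return std::move(*e.buffer);
  std::vector<int> storage = allocate(e.size);
  for (std::size_t i = 0; i < e.size; ++i)
    storage[i] = e.element(i);
  return storage;
}
```

In this model, an expression built with `evaluateInMemory` costs two buffers in total when passed through `tempThenAssign`, but only one when passed through `associate`. For an expression with no buffer yet, both paths allocate exactly one.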

https://github.com/llvm/llvm-project/pull/85960

