[flang] [llvm] [Flang][OpenMP] Fix mapping of character type with LEN > 1 specified (PR #154172)

Tue Sep 9 06:46:31 PDT 2025

================
@@ -69,13 +84,78 @@ struct MapInfoOpConversion
       return mlir::failure();
 
     llvm::SmallVector<mlir::NamedAttribute> newAttrs;
-    mlir::omp::MapInfoOp newOp;
+    mlir::omp::MapBoundsOp mapBoundsOp;
     for (mlir::NamedAttribute attr : curOp->getAttrs()) {
       if (auto typeAttr = mlir::dyn_cast<mlir::TypeAttr>(attr.getValue())) {
         mlir::Type newAttr;
         if (fir::isTypeWithDescriptor(typeAttr.getValue())) {
           newAttr = lowerTy().convertBoxTypeAsStruct(
               mlir::cast<fir::BaseBoxType>(typeAttr.getValue()));
+        } else if (fir::isa_char_string(fir::unwrapSequenceType(
+                       fir::unwrapPassByRefType(typeAttr.getValue()))) &&
+                   !characterWithDynamicLen(
+                       fir::unwrapPassByRefType(typeAttr.getValue()))) {
+          // Characters with a LEN param are represented as strings
+          // (array of characters), the lowering to LLVM dialect
+          // doesn't generate bounds for these (and this is not
+          // done at the initial lowering either) and there is
+          // minor inconsistencies in the variable types we
+          // create for the map without this step when converting
+          // to the LLVM dialect.
+          //
+          // For example, given the types:
+          //
+          //  1) CHARACTER(LEN=16), dimension(:,:), allocatable :: char_arr
+          //  2) CHARACTER(LEN=16), dimension(10,10) :: char_arr
+          //
+          // We get the FIR types (note for 1: we already peeled off the
+          // dynamic extents from the type at this stage, but the conversion
+          // to llvm dialect does that in any case, so the final result
+          // is the same):
+          //
+          //  1) !fir.char<1,16>
+          //  2) !fir.array<10x10x!fir.char<1,16>>
+          //
+          // Which are converted to the LLVM dialect types:
+          //
+          // 1) !llvm.array<16 x i8>
+          // 2) llvm.array<10 x array<10 x array<16 x i8>>
+          //
+          // And in both cases, we are missing the innermost bounds for
+          // the !fir.char<1,16> which is expanded into a 16 x i8 array
+          // in the conversion to LLVM dialect.
+          //
+          // The problem with this is that we would like to treat these
+          // cases identically and not have to create specialised
+          // lowerings for either of these in the lowering to LLVM-IR
+          // and treat them like any other array that passes through.
+          //
+          // To do so below, we generate an extra bound for the
+          // innermost array (the char type/string) using the LEN
+          // parameter of the character type. And we "canonicalize"
+          // the type, stripping it down to the base element type,
+          // which in this case is an i8. This effectively allows
+          // the lowering to treat this as a 1-D array with multiple
+          // bounds which it is capable of handling without any special
+          // casing.
+          // TODO: Handle dynamic LEN characters.
+          if (auto ct = mlir::dyn_cast_or_null<fir::CharacterType>(
+                  fir::unwrapSequenceType(typeAttr.getValue()))) {
+            newAttr = converter->convertType(
+                fir::unwrapSequenceType(typeAttr.getValue()));
+            if (auto type = mlir::dyn_cast<mlir::LLVM::LLVMArrayType>(newAttr))
+              newAttr = type.getElementType();
+            // We do not generate for device, as MapBoundsOps are
+            // unsupported, as they're currently unused.
----------------
agozillon wrote:

Both, we omit a lot of unused host operations from the device MLIR now and this is one of those :-) Thank you for pointing it out, I'll try to rephrase it better before I land this PR!

https://github.com/llvm/llvm-project/pull/154172