[Mlir-commits] [mlir] Fix bytecode roundtrip of unregistered ops (PR #82932)

Sun Feb 25 11:55:18 PST 2024

https://github.com/mfrancio created https://github.com/llvm/llvm-project/pull/82932

When roundtripping to bytecode an unregistered operation name that does
not contain any '.' separator, the bytecode writer will emit an op
encoding without a proper opName. In this case, the string just becomes
a possibly unknown dialect name. At parsing, this dialect name is
used as a proper operation name.

However, when the unregistered operation name coincidentally matches
that of a dialect, the parser would fail. That means we can't roundtrip
an unregistered op with a name that matches one of the registered
dialect names. For example,

```
"index"() : () -> ()
```

can be emitted but cannot be parsed, because its name is coincidentally
the same as that of the Index dialect. The patch removes such
inconsistency.

This patch specifically fixes the bytecode roundtrip of
`mlir/test/IR/parser.mlir`.

>From 270429d9c8bf97705dfc09717961ad11f2123171 Mon Sep 17 00:00:00 2001
From: Matteo Franciolini <mfranciolini at tesla.com>
Date: Mon, 19 Feb 2024 16:13:28 -0800
Subject: [PATCH] Fix bytecode roundtrip of unregistered ops

When roundtripping to bytecode an unregistered operation name that does
not contain any '.' separator, the bytecode writer will emit an op
encoding without a proper opName. In this case, the string just becomes
a possibly unknown dialect name. At parsing, this dialect name is
used as a proper operation name.

However, when the unregistered operation name coincidentally matches
that of a dialect, the parser would fail. That means we can't roundtrip
an unregistered op with a name that matches one of the registered
dialect names. For example,

```
"index"() : () -> ()
```

can be emitted but cannot be parsed, because its name is coincidentally
the same as that of the Index dialect. The patch removes such
inconsistency.

This patch specifically fixes the bytecode roundtrip of
`mlir/test/IR/parser.mlir`.
---
 mlir/lib/Bytecode/Reader/BytecodeReader.cpp | 14 +++++---------
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/mlir/lib/Bytecode/Reader/BytecodeReader.cpp b/mlir/lib/Bytecode/Reader/BytecodeReader.cpp
index d61634062784c6..dd1e4abaea1664 100644
--- a/mlir/lib/Bytecode/Reader/BytecodeReader.cpp
+++ b/mlir/lib/Bytecode/Reader/BytecodeReader.cpp
@@ -1854,22 +1854,18 @@ BytecodeReader::Impl::parseOpName(EncodingReader &reader,
   // Check to see if this operation name has already been resolved. If we
   // haven't, load the dialect and build the operation name.
   if (!opName->opName) {
-    // Load the dialect and its version.
-    DialectReader dialectReader(attrTypeReader, stringReader, resourceReader,
-                                dialectsMap, reader, version);
-    if (failed(opName->dialect->load(dialectReader, getContext())))
-      return failure();
     // If the opName is empty, this is because we use to accept names such as
     // `foo` without any `.` separator. We shouldn't tolerate this in textual
     // format anymore but for now we'll be backward compatible. This can only
     // happen with unregistered dialects.
     if (opName->name.empty()) {
-      if (opName->dialect->getLoadedDialect())
-        return emitError(fileLoc) << "has an empty opname for dialect '"
-                                  << opName->dialect->name << "'\n";
-
       opName->opName.emplace(opName->dialect->name, getContext());
     } else {
+      // Load the dialect and its version.
+      DialectReader dialectReader(attrTypeReader, stringReader, resourceReader,
+                                  dialectsMap, reader, version);
+      if (failed(opName->dialect->load(dialectReader, getContext())))
+        return failure();
       opName->opName.emplace((opName->dialect->name + "." + opName->name).str(),
                              getContext());
     }