[Mlir-commits] [mlir] Add a polynomial dialect shell, attributes, and types (PR #72081)

Wed Nov 15 13:10:41 PST 2023

================
@@ -0,0 +1,217 @@
+//===- PolynomialAttributes.cpp - Polynomial dialect attributes --*- C++
+//-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+#include "mlir/Dialect/Polynomial/IR/PolynomialAttributes.h"
+
+#include "mlir/Dialect/Polynomial/IR/Polynomial.h"
+#include "mlir/Support/LLVM.h"
+#include "mlir/Support/LogicalResult.h"
+#include "llvm/ADT/SmallSet.h"
+#include "llvm/ADT/StringExtras.h"
+#include "llvm/ADT/StringRef.h"
+
+namespace mlir {
+namespace polynomial {
+
+void PolynomialAttr::print(AsmPrinter &p) const {
+  p << '<';
+  p << getPolynomial();
+  p << '>';
+}
+
+/// Try to parse a monomial. If successful, populate the fields of the outparam
+/// `monomial` with the results, and the `variable` outparam with the parsed
+/// variable name.
+ParseResult parseMonomial(AsmParser &parser, Monomial &monomial,
+                          llvm::StringRef *variable, bool *isConstantTerm) {
+  APInt parsedCoeff(apintBitWidth, 1);
+  auto result = parser.parseOptionalInteger(parsedCoeff);
+  if (result.has_value()) {
+    if (failed(*result)) {
+      parser.emitError(parser.getCurrentLocation(),
+                       "Invalid integer coefficient.");
+      return failure();
+    }
+  }
+
+  // Variable name
+  result = parser.parseOptionalKeyword(variable);
+  if (!result.has_value() || failed(*result)) {
----------------
j2kun wrote:

I tried rewriting this to make it less hacky in 6f175107f6937bb0c2497a80b3de73fa336d0c89, but without being able to look ahead in some manner, I feel it will remain hacky.

The complexity is that a monomial can have the following forms, separated by `+`:

- `4`: a degree-0 term (variable and exponent omitted)
- `x`: a coefficient-1, degree-1 term (coefficient and exponent omitted)
- `2x`: a degree-1 term (exponent omitted)
- `x**5`: a coefficient-1 term (coefficient omitted)
- `3x**5`: no parts omitted

Handling all of these possibilities is a bit of spaghetti because if a parse method succeeds, it consumes the token, so I can't try one form and then retry the same tokens as a different form. The options I see to make it better are:

- Simplify the allowed syntax so that every monomial has the form `<coeff> <symbol>**<exponent>`, even if `exponent` is zero or `coeff` is 1. Then there's only one legal path.
- Somehow extend the parser so that I can attempt to parse the different allowed monomial forms in completely separate functions, and then if it fails "reset" to the previous parser location before trying the next one. Then I can work from the most specific form (no parts omitted) to the least specific (degree-0 monomials). E.g., a `parseFullMonomial` function that does the last item in the list above, then fall back to `parseMonomialMissingCoefficient`, then `parseMonomialMissingExponent`, etc.
- Provide a generalization of the comma-separated list parsing function, i.e. a "parse token-separated things" function, to handle the variadic number of `+`-separated terms.
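
To make the second option concrete, here is a minimal standalone sketch of the "try a form, reset on failure" idea. It deliberately does not use `AsmParser`: `Cursor`, `Term`, `attempt`, and the per-form functions are all hypothetical names invented for illustration, the cursor walks a plain string rather than MLIR tokens, and only non-negative coefficients are handled. Whether `AsmParser` can expose an equivalent checkpoint/reset is exactly the open question above.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for Monomial; not the dialect's type.
struct Term {
  uint64_t coeff = 1;
  std::string var;  // empty for a degree-0 term
  uint64_t exponent = 0;
};

// Toy cursor over a string. Copying it is the checkpoint; assigning the copy
// back is the reset.
struct Cursor {
  std::string_view text;
  std::size_t pos = 0;

  void skipSpace() {
    while (pos < text.size() &&
           std::isspace(static_cast<unsigned char>(text[pos])))
      ++pos;
  }
  bool atEnd() {
    skipSpace();
    return pos == text.size();
  }
  bool consume(std::string_view tok) {
    skipSpace();
    if (text.substr(pos, tok.size()) != tok) return false;
    pos += tok.size();
    return true;
  }
  std::optional<uint64_t> integer() {
    skipSpace();
    std::size_t start = pos;
    uint64_t value = 0;
    while (pos < text.size() &&
           std::isdigit(static_cast<unsigned char>(text[pos])))
      value = value * 10 + static_cast<uint64_t>(text[pos++] - '0');
    if (pos == start) return std::nullopt;
    return value;
  }
  std::optional<std::string> identifier() {
    skipSpace();
    std::size_t start = pos;
    while (pos < text.size() &&
           std::isalpha(static_cast<unsigned char>(text[pos])))
      ++pos;
    if (pos == start) return std::nullopt;
    return std::string(text.substr(start, pos - start));
  }
};

// One function per allowed form. Each may consume input before failing;
// attempt() below undoes that.
bool parseFull(Cursor &c, Term &t) {  // 3x**5
  auto coeff = c.integer();
  if (!coeff) return false;
  auto var = c.identifier();
  if (!var || !c.consume("**")) return false;
  auto exp = c.integer();
  if (!exp) return false;
  t = Term{*coeff, *var, *exp};
  return true;
}
bool parseMissingCoefficient(Cursor &c, Term &t) {  // x**5
  auto var = c.identifier();
  if (!var || !c.consume("**")) return false;
  auto exp = c.integer();
  if (!exp) return false;
  t = Term{1, *var, *exp};
  return true;
}
bool parseMissingExponent(Cursor &c, Term &t) {  // 2x
  auto coeff = c.integer();
  if (!coeff) return false;
  auto var = c.identifier();
  if (!var) return false;
  t = Term{*coeff, *var, 1};
  return true;
}
bool parseVariableOnly(Cursor &c, Term &t) {  // x
  auto var = c.identifier();
  if (!var) return false;
  t = Term{1, *var, 1};
  return true;
}
bool parseConstant(Cursor &c, Term &t) {  // 4
  auto coeff = c.integer();
  if (!coeff) return false;
  t = Term{*coeff, "", 0};
  return true;
}

// Run one form parser; on failure, restore the cursor to where it started.
template <typename Fn>
bool attempt(Cursor &c, Term &t, Fn parse) {
  Cursor saved = c;
  if (parse(c, t)) return true;
  c = saved;
  return false;
}

// Most specific form first, least specific last.
bool parseMonomial(Cursor &c, Term &t) {
  return attempt(c, t, parseFull) || attempt(c, t, parseMissingCoefficient) ||
         attempt(c, t, parseMissingExponent) ||
         attempt(c, t, parseVariableOnly) || attempt(c, t, parseConstant);
}

// `+`-separated monomials; returns nullopt on any leftover or malformed input.
std::optional<std::vector<Term>> parsePolynomial(std::string_view text) {
  Cursor c{text};
  std::vector<Term> terms;
  do {
    Term t;
    if (!parseMonomial(c, t)) return std::nullopt;
    terms.push_back(t);
  } while (c.consume("+"));
  if (!c.atEnd()) return std::nullopt;
  return terms;
}
```

In this sketch, `parsePolynomial("3x**5 + 2x + x**2 + x + 4")` yields five terms, one per form listed above, while a malformed input such as `3x**` is rejected because `**` is left unconsumed after the `2x`-style fallback matches `3x`. The cost is that the same prefix may be re-scanned up to five times per monomial, which seems acceptable for attribute-sized inputs.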

https://github.com/llvm/llvm-project/pull/72081