[Mlir-commits] [mlir] [mlir] Add loaded URI attribute type (PR #67276)

Wed Sep 27 12:09:34 PDT 2023

jpienaar wrote:

> > Not sure I agree, I see this as no more specific than DenseElementsAttr.
> 
> There is history behind builtin, and it's pretty much admitted that we likely wouldn't have most of the things "builtin" if they were added in today's MLIR land. (we even looked into how to move tensor and memref to their own dialects!).
> 
> So the existence of something in builtin isn't a reason in itself to add more :)

True, but I see almost nothing dialect specific here. Even today when we landed ArrayAttr, it could have been anywhere, but it mostly made sense here in an independent manner. Now we could have a file or uri dialect :)

> 
> Right now this looks like a nice convenient tool for specific cases (looks almost like it could be an "example" of using resources somehow), but too far from something "core enough" to be there. Overall it's not clear to me that this is pulling its weight as an attribute in itself, instead of a more general mechanism (seems almost closer to an interface? But how do we handling the registrations of various loading mechanisms?).

Registry could be as simple as a static member on the attribute (I mean `DenseMap<StringRef, SmallVector<StringAttrStorage *>> dialectReferencingStrAttrs;` is on context, but not sure if we'd want to add more to context, seems per compilation as static member is fine). So here we'd have something like

```c++
vector<function_ref<std::optional<MemoryBuffer>(StringRef uri)> registry;
```

and

```c++
static LogicalResult LoadedURIElementsAttr::registerHandler(function_ref<std::optional<MemoryBuffer>(StringRef)> handler);
```

which is a switch on the schema of the URI and returns a MemoryBuffer (now one go really complicated to try and enable streaming in data over RPCs ... but that'll make any access variable cost which could be surprising). And so one iterates over attempting different matchers, stop when there is a match and proceed further. I decided (well ... decided ...) against a map of schema here as I don't think we'll actually have that many and it also allows for cases where the same handler could handle multiple schemas.

Now for the actual file decoding below, it just takes a MemoryBuffer, but there one would need a vector of file type handlers too.

The main reason I went for attribute is the uniqueing behavior, its just a ElementsAttr like any other (this is the interface of interest here for me) and one can use it different ways on different ops, while the backing storage is (ideally) just memory mapped from file and roundtrips "just as name". I considered making this a resource type instead, that wasn't as obvious how to get all the parts, but could revisit. This was also on the builtin dialect though, so doesn't remove that concern.

https://github.com/llvm/llvm-project/pull/67276