[PATCH] D133716: [CAS] Add LLVMCAS library with InMemoryCAS implementation

David Blaikie via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Dec 1 17:04:23 PST 2022


dblaikie added inline comments.


================
Comment at: llvm/include/llvm/CAS/CASID.h:26
+class CASContext {
+  virtual void anchor();
+
----------------
This could be private to avoid polluting autocomplete and such, maybe? (alternatively it's probably not too costly to have the dtor virtual and out of line to act as an anchor - I assume CASContexts aren't being created/torn down with any great frequency such that some dtor indirection would be especially costly?)


================
Comment at: llvm/include/llvm/CAS/CASID.h:108-109
+private:
+  CASID(const CASContext *Context, StringRef Hash)
+      : Context(Context), Hash(Hash) {}
+
----------------
if you have a `std::string` member, should probably take the parameter as `std::string` and `std::move` it into the member - in cases where the caller can discard a string that'll be more efficient, and in cases where the caller can't, it's not especially worse.

(though shoudl the member be `std::string`? should it be `SmallString` or anything else?)


================
Comment at: llvm/include/llvm/CAS/CASReference.h:62
+    assert(
+        (isDenseMapSentinel() || RHS.isDenseMapSentinel() || CAS == RHS.CAS) &&
+        "Cannot compare across CAS instances");
----------------
Only the last of these checks is limited to `LLVM_ENABLE_ABI_BREAKING_CHECKS` yeah? Might be worth pulling the other two out of the preprocessor conditional, so they get run more broadly?


================
Comment at: llvm/include/llvm/CAS/CASReference.h:106
+/// ObjectHandle, a variant that knows what kind of entity it is. \a
+/// ObjectStore::getReferenceKind() can expect the type of reference without
+/// asking for unloaded objects to be loaded.
----------------
expect -> extract? or something else?


================
Comment at: llvm/include/llvm/CAS/CASReference.h:110-113
+/// assertions are on). If necessary, it can be deconstructed and reconstructed
+/// using \a Reference::getInternalRef() and \a
+/// Reference::getFromInternalRef(), but clients aren't expected to need to do
+/// this. These both require the right \a ObjectStore instance.
----------------
If clients aren't expected to need to do it, could we leave that out until some necessary use comes up?


================
Comment at: llvm/include/llvm/CAS/CASReference.h:162
+/// data and references.
+class ObjectHandle : public ReferenceBase {
+public:
----------------
Since `Handle` and `Ref` aren't especially self documenting (while reading the docs for `Ref` I wondered if it should be called `Handle` - until I got to the bit where it explained that that exists too/separately) - is there something that'd be more explicit about the difference between these? (like, I wonder if `ObjectHandle` could be `Object`? - not sure that'd make `ObjectRef` more obvious, though... )


================
Comment at: llvm/include/llvm/CAS/ObjectStore.h:49
+///     - \a loadObject(const CASID&) looks up an object by its UID.
+/// - Objects can reference other objects, forming an arbitrary DAG.
+///
----------------
do they "reference" other objects - or does an object consist of other objects/include those objects in some sense? (maybe there's no real difference, and this current language is better... not sure)


================
Comment at: llvm/include/llvm/CAS/ObjectStore.h:53-57
+/// - \a ObjectRef encapsulates a reference to something in the CAS. It is an
+///   opaque type that references an object inside a specific CAS. It is
+///   implementation defined if the underlying object exists or not for an
+///   ObjectRef, and it can used to speed up CAS lookup as an implementation
+///   detail. However, you don't know anything about the underlying objects.
----------------
Not clear from this what the difference between an `ObjectRef` and a `CASID` are - is that worth more details?


================
Comment at: llvm/include/llvm/CAS/ObjectStore.h:61-64
+/// - \a ObjectHandle encapulates a *loaded* object in the CAS. You need one of
+///   these to inspect the content of an object: to look at its stored
+///   data and references. This is internal to CAS implementation and not
+///   availble from CAS public APIs.
----------------
Ah. May be worth splitting up the documentation here more clearly between public and private concepts, rather than having this internal detail interleaved with external ones?


================
Comment at: llvm/include/llvm/CAS/ObjectStore.h:90
+  friend class ReferenceBase;
+  void anchor();
+
----------------
anchor's only useful if it's virtual?
Maybe jus tmake the dtor out of line and use that as an anchor?


================
Comment at: llvm/include/llvm/CAS/ObjectStore.h:109-110
+
+  /// Validate the underlying object referred by CASID.
+  virtual Error validate(const CASID &ID) = 0;
+
----------------
what sort of validity is of concern here?


================
Comment at: llvm/include/llvm/CAS/ObjectStore.h:174-175
+  /// Methods for handling objects.
+  virtual Error forEachRef(ObjectHandle Node,
+                           function_ref<Error(ObjectRef)> Callback) const = 0;
+  virtual ObjectRef readRef(ObjectHandle Node, size_t I) const = 0;
----------------
this lets you walk which other objects reference this object?
I guess you've probably got some pretty core use cases/need for this - but dealing with updating use lists in LLVM IR at least makes me ask: Do you need to be able to walk the uses of an object?


================
Comment at: llvm/lib/CAS/InMemoryCAS.cpp:262
+
+ArrayRef<char> InMemoryObject::getData() const {
+  if (auto *Derived = dyn_cast<InMemoryRefObject>(this))
----------------
I'd probably put the `inline` keyword here and leave it off the declaration (otherwise I'd read this code and wonder how it's valid/doesn't produce duplicate definitions) - wonder what's more common in the LLVM project/codebase. No big deal either way, though.


================
Comment at: llvm/unittests/CAS/ObjectStoreTest.cpp:161
+      "word",
+      "some longer text std::string's local memory",
+      R"(multiline text multiline text multiline text multiline text
----------------
not sure I understand the mention of std::string here - since this is `StringRef`? (maybe this is some remnant of a different version of the code)


================
Comment at: llvm/unittests/CAS/ObjectStoreTest.cpp:173-175
+    // Use StringRef::str() to create a temporary std::string. This could cause
+    // problems if the CAS is storing references to the input string instead of
+    // copying it.
----------------
Rather than testing this via UB, could it be tested via pointer equality? (eg: when querying the CAS, check that the data points to the same place as the data pasesd in)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133716/new/

https://reviews.llvm.org/D133716



More information about the llvm-commits mailing list