[PATCH] D133716: [CAS] Add LLVMCAS library with InMemoryCAS implementation

David Blaikie via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Dec 7 15:30:53 PST 2022


dblaikie added inline comments.


================
Comment at: llvm/include/llvm/CAS/CASID.h:108-109
+private:
+  CASID(const CASContext *Context, StringRef Hash)
+      : Context(Context), Hash(Hash) {}
+
----------------
steven_wu wrote:
> dblaikie wrote:
> > if you have a `std::string` member, should probably take the parameter as `std::string` and `std::move` it into the member - in cases where the caller can discard a string that'll be more efficient, and in cases where the caller can't, it's not especially worse.
> > 
> > (though shoudl the member be `std::string`? should it be `SmallString` or anything else?)
> I don't have much preference for the internal data type. CASID should only be relevant when you actually need the hash representation, often when you need to print out the value for outside users. Otherwise, `ObjectRef` is always a better choice.
Cool - I see you updated the parameter to `std::string&&` - generally I'd expect it to be `std::string` value, not a reference. That way the caller can move into the parameter if they have a string to discard, or allow the copy if they want to keep copies in the caller for whatever reason.


================
Comment at: llvm/include/llvm/CAS/CASReference.h:162
+/// data and references.
+class ObjectHandle : public ReferenceBase {
+public:
----------------
steven_wu wrote:
> dblaikie wrote:
> > Since `Handle` and `Ref` aren't especially self documenting (while reading the docs for `Ref` I wondered if it should be called `Handle` - until I got to the bit where it explained that that exists too/separately) - is there something that'd be more explicit about the difference between these? (like, I wonder if `ObjectHandle` could be `Object`? - not sure that'd make `ObjectRef` more obvious, though... )
> `ObjectRef` is just a reference, and you can't access anything in the object it points to, until you load the object using ObjectRef, which will turn into an `ObjectHandle`. The load can have latency or can fail.
> 
> Maybe the comments in ObjectStore.h is a better explanation with a bigger picture? Or should we put all the documentation into one place so it is easier to read?
Still feels like the names could be more self-descriptive, even with/independent of documentation improvements. `Ref` and `Handle` both tend to connote a thing you can use to inspect the thing being referred to or handled. (if anything, I guess I'd actually expect a Handle to be the inaccessible version and the Ref to be the accessible version (thinking StringRef, ArrayRef, etc - where the data is directly accessible through that abstraction, whereas handles sometimes you have to pass back into some other object/API to then get the data (like a file handle/file descriptor)))

Seems like ObjectRef is maybe more suitably called ObjectID?


================
Comment at: llvm/include/llvm/CAS/ObjectStore.h:59
+///   "Loading" the object is a separate step that may not have happened
+///   yet, and which can fail (e.g. due to filesystem corruption) or introduce
+///   latency (if downloading from a remote store).
----------------
Presumably it can also fail if the object isn't in the given CAS? (maybe that's the more obvious/simpler to document example?)


================
Comment at: llvm/include/llvm/CAS/ObjectStore.h:61-64
+/// - \a ObjectHandle encapulates a *loaded* object in the CAS. You need one of
+///   these to inspect the content of an object: to look at its stored
+///   data and references. This is internal to CAS implementation and not
+///   availble from CAS public APIs.
----------------
dblaikie wrote:
> Ah. May be worth splitting up the documentation here more clearly between public and private concepts, rather than having this internal detail interleaved with external ones?
ping on this


================
Comment at: llvm/include/llvm/CAS/ObjectStore.h:109-110
+
+  /// Validate the underlying object referred by CASID.
+  virtual Error validate(const CASID &ID) = 0;
+
----------------
steven_wu wrote:
> dblaikie wrote:
> > what sort of validity is of concern here?
> This is more a debugging function that can check if the integrity of the CAS has been broken. The current implementation for builtin CAS is to rehash the object fetched from CASID and make sure the hash matches. If not, there is either a bug in CAS or the data has been corrupted.
Might be worth a few more words in the description maybe about "consistency" or somesuch?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133716/new/

https://reviews.llvm.org/D133716



More information about the llvm-commits mailing list