[PATCH] D133716: [CAS] Add LLVMCAS library with InMemoryCAS implementation

Steven Wu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Dec 8 09:50:40 PST 2022


steven_wu added inline comments.


================
Comment at: llvm/include/llvm/CAS/CASReference.h:162
+/// data and references.
+class ObjectHandle : public ReferenceBase {
+public:
----------------
dblaikie wrote:
> steven_wu wrote:
> > dblaikie wrote:
> > > Since `Handle` and `Ref` aren't especially self documenting (while reading the docs for `Ref` I wondered if it should be called `Handle` - until I got to the bit where it explained that that exists too/separately) - is there something that'd be more explicit about the difference between these? (like, I wonder if `ObjectHandle` could be `Object`? - not sure that'd make `ObjectRef` more obvious, though... )
> > `ObjectRef` is just a reference, and you can't access anything in the object it points to, until you load the object using ObjectRef, which will turn into an `ObjectHandle`. The load can have latency or can fail.
> > 
> > Maybe the comments in ObjectStore.h is a better explanation with a bigger picture? Or should we put all the documentation into one place so it is easier to read?
> Still feels like the names could be more self-descriptive, even with/independent of documentation improvements. `Ref` and `Handle` both tend to connote a thing you can use to inspect the thing being referred to or handled. (if anything, I guess I'd actually expect a Handle to be the inaccessible version and the Ref to be the accessible version (thinking StringRef, ArrayRef, etc - where the data is directly accessible through that abstraction, whereas handles sometimes you have to pass back into some other object/API to then get the data (like a file handle/file descriptor)))
> 
> Seems like ObjectRef is maybe more suitably called ObjectID?
I don't have strong opinion on names. Maybe @dexonsmith @benlangmuir @akyrtzi can also provide some feedbacks for names?

In my opinion, `ObjectRef` is named that way because it is like a reference type (just an opaque pointer) and you need to "dereference" it to access underlying data, just that the `dereference` might fail in CAS. The difference between `ObjectRef` and `ObjectHandle` is mostly providing the flexibility for a remote CAS so there is no need to traffic all the data if you only dealing with refs.

Also `ObjectRef` is the public type we encourage to use, just like other `Ref` types like `StringRef` and `ArrayRef`. It is cheap and allows quick fetch of the underlying data (at least for the builtin CAS we provided). It doesn't really contains ID, and you can't compare `Ref` from different ObjectStore.

I thought about rename `CASID` to `ObjectID` to keep the name consistent, but there is no other more benefit. I am up for renaming types to make it more self-descriptive but I would like to reach an agreement before doing so because we have lots of downstream code needs to upstream and I would like to avoid repeated renames. 


================
Comment at: llvm/include/llvm/CAS/ObjectStore.h:59
+///   "Loading" the object is a separate step that may not have happened
+///   yet, and which can fail (e.g. due to filesystem corruption) or introduce
+///   latency (if downloading from a remote store).
----------------
dblaikie wrote:
> Presumably it can also fail if the object isn't in the given CAS? (maybe that's the more obvious/simpler to document example?)
This question is complicated. Maybe we should write down a spec for the CAS APIs? To this question, the current spec for when to lookup for a CAS object to make sure it exists is totally implementation defined.

For builtinCAS here, ObjectRef always points to existing object, unless the integrity of the builtinCAS is broken. For a remote CAS, you might actually want to have ObjectRef to be unverified to avoid a roundtrip to remote to validate a ObjectRef. 

Also to you previous point of split up public/private interface document, is it better if I create a separate docs in `llvm/docs` to explain different concepts from the views of: 1. CAS users 2. CAS implementors?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133716/new/

https://reviews.llvm.org/D133716



More information about the llvm-commits mailing list