[PATCH] D130516: [llvm] compression classes

David Blaikie via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jul 29 13:26:20 PDT 2022


dblaikie added a comment.

In D130516#3688123 <https://reviews.llvm.org/D130516#3688123>, @MaskRay wrote:

> I'd like to make a few arguments for the current namespace+free function design, as opposed to the class+member function design as explored in this patch (but thanks for the exploration!).
> Let's discuss several use cases.
>
> (a) if a use case just calls compress/uncompress. The class design has slightly more boilerplate as it needs to get the algorithm class, a new instance, or a singleton instance.
> For each new use, the number of lines may not differ, but the involvement of a a static class member or an instance make the reader wonder whether the object will be reused or thrown away.
> There is some slight cognitive burden.
> The class design has a non-trivial one-shot cost to have a function returning the singleton instance.

Though there must've been a condition that dominates this use somewhere - I'd suggest that condition could be where the algorithm is retrieved, and then passed to this code to use unconditionally.

If the algorithm object is const and raw pointers/references are used, I think it makes it clear to the reader that there's no ownership here, and it's not stateful when compressing/decompressing.

> (b) zlib compress/uncompress immediately following an availability check.
>
>   // free function
>   if (!compression::zlib::isAvailable())
>     errs() << "cannot compress: " << compression::zlib::buildConfigurationHint();
>   
>   // class
>   auto *algo = !compression::ZlibCompression;
>   if (!algo->isAvailable()) {
>     errs() << "cannot compress: " << algo->buildConfigurationHint();
>   }

I think maybe this code might end up looking like:

  Algo *algo = getAlgo(Zlib)
  if (!algo)
    errs() ...

It's possible that this function would return non-null even for a non-available algorithm if we wanted to communicate other things (like the cmake macro name to enable to add the functionality)

> (c) zlib/zstd compress/uncompress immediately following an availability check.
>
>   // free function
>   if (!compression::isAvailable(format))
>     errs() << "cannot compress: " << compression::buildConfigurationHint(format);
>   
>   // class
>   std::unique_ptr<Compression> algo = make_compression(format);
>   if (!algo->isAvailable()) {
>     errs() << "cannot compress: " << algo->buildConfigurationHint();
>   }

I don't think there's a need for unique_ptr here - algorithms can be constant singletons, referenced via raw const pointers/references without ownership.

& this example doesn't include the code that does the compression/decompression,  which seems part of the discussion & part I find nice in that the type of compression used matches the type used in the check necessarily rather than being passed into two APIs independently.

> (d) compress/uncompress and an availability check are apart.
>
>   // free function
>   no change
>   
>   // class
>   Store (the pointer to the) the algorithm object somewhere, or construct the pointer/object twice.

The benefit here is that it's harder for the test to become separated from the usage - for the usage to end up becoming unconditional/incorrectly guarded.



================
Comment at: llvm/lib/Support/Compression.cpp:30-32
+ZStdCompressionAlgorithm
+    *llvm::compression::ZStdCompressionAlgorithm::Instance =
+        new ZStdCompressionAlgorithm();
----------------
leonardchan wrote:
> Perhaps for each of these, you could instead have something like:
> 
> ```
> ZStdCompressionAlgorithm *getZStdCompressionAlgorithm() {
>   static ZStdCompressionAlgorithm* instance = new ZStdCompressionAlgorithm;
>   return instance;
> }
> ```
> 
> This way the instances are only new'd when they're actually used.
Yep, I'd mentioned/suggested that (so, seconding here) elsewhere encouraging these to be singletons: https://reviews.llvm.org/D130516#3683384

And they don't even need to be 'new'd in that case, this would be fine:
```
ZstdCompressionAlgorithm &getZstdCompressionAlgorithm() {
  static ZstdCompressionAlgorithm C;
  return C;
}
```

Though I think maybe we don't need individual access to the algorithms, and it'd be fine to have only a single entry point like this:
```
CompressionAlgorithm *getCompressionAlgorithm(DebugCompressionType T) {
  switch (T) {
  case DebugCompressionType::ZStd: {
    static zstd::CompressionAlgorithm Zstd;
    if (zstd::isAvailable())
      return &Zstd;
  }
  ...
  }
  return nullptr;
}
```
(or, possibly, we want to return non-null even if it isn't available, if we include other things (like the configure macro name - so callers can use that name to print helpful error messages - but then they have to explicitly check if the algorithm is available after the call))


================
Comment at: llvm/lib/Support/Compression.cpp:195-199
+constexpr SupportCompressionType ZStdCompressionAlgorithm::AlgorithmId;
+constexpr StringRef ZStdCompressionAlgorithm::Name;
+constexpr int ZStdCompressionAlgorithm::BestSpeedCompression;
+constexpr int ZStdCompressionAlgorithm::DefaultCompression;
+constexpr int ZStdCompressionAlgorithm::BestSizeCompression;
----------------
I don't think there's particular value in these being constexpr members - and maybe we don't need these at all just yet/could leave them out for now? It'd be great to reduce this whole patch to something more comparable with https://reviews.llvm.org/D130458

If you have plans for these other properties it might be helpful to understand what they are - they might help inform the design discussion.

(if we are keeping tnhese properties, including the string version of the name, etc - I'd think the way to do it would be for the base algorithm class to have non-static members to store these, and derived algorithm classes to pass the values into the base ctor to be stored in the members - they could even be const public members of the algorithm to be accessed directly, rather than via accessor functions (& certainly not virtual accessor functions))


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130516/new/

https://reviews.llvm.org/D130516



More information about the llvm-commits mailing list