[PATCH] D130516: [llvm] compression classes
David Blaikie via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Fri Jul 29 13:26:20 PDT 2022
dblaikie added a comment.
In D130516#3688123 <https://reviews.llvm.org/D130516#3688123>, @MaskRay wrote:
> I'd like to make a few arguments for the current namespace+free function design, as opposed to the class+member function design as explored in this patch (but thanks for the exploration!).
> Let's discuss several use cases.
>
> (a) if a use case just calls compress/uncompress. The class design has slightly more boilerplate as it needs to get the algorithm class, a new instance, or a singleton instance.
> For each new use, the number of lines may not differ, but the involvement of a a static class member or an instance make the reader wonder whether the object will be reused or thrown away.
> There is some slight cognitive burden.
> The class design has a non-trivial one-shot cost to have a function returning the singleton instance.
Though there must've been a condition that dominates this use somewhere - I'd suggest that condition could be where the algorithm is retrieved, and then passed to this code to use unconditionally.
If the algorithm object is const and raw pointers/references are used, I think it makes it clear to the reader that there's no ownership here, and it's not stateful when compressing/decompressing.
> (b) zlib compress/uncompress immediately following an availability check.
>
> // free function
> if (!compression::zlib::isAvailable())
> errs() << "cannot compress: " << compression::zlib::buildConfigurationHint();
>
> // class
> auto *algo = !compression::ZlibCompression;
> if (!algo->isAvailable()) {
> errs() << "cannot compress: " << algo->buildConfigurationHint();
> }
I think maybe this code might end up looking like:
Algo *algo = getAlgo(Zlib)
if (!algo)
errs() ...
It's possible that this function would return non-null even for a non-available algorithm if we wanted to communicate other things (like the cmake macro name to enable to add the functionality)
> (c) zlib/zstd compress/uncompress immediately following an availability check.
>
> // free function
> if (!compression::isAvailable(format))
> errs() << "cannot compress: " << compression::buildConfigurationHint(format);
>
> // class
> std::unique_ptr<Compression> algo = make_compression(format);
> if (!algo->isAvailable()) {
> errs() << "cannot compress: " << algo->buildConfigurationHint();
> }
I don't think there's a need for unique_ptr here - algorithms can be constant singletons, referenced via raw const pointers/references without ownership.
& this example doesn't include the code that does the compression/decompression, which seems part of the discussion & part I find nice in that the type of compression used matches the type used in the check necessarily rather than being passed into two APIs independently.
> (d) compress/uncompress and an availability check are apart.
>
> // free function
> no change
>
> // class
> Store (the pointer to the) the algorithm object somewhere, or construct the pointer/object twice.
The benefit here is that it's harder for the test to become separated from the usage - for the usage to end up becoming unconditional/incorrectly guarded.
================
Comment at: llvm/lib/Support/Compression.cpp:30-32
+ZStdCompressionAlgorithm
+ *llvm::compression::ZStdCompressionAlgorithm::Instance =
+ new ZStdCompressionAlgorithm();
----------------
leonardchan wrote:
> Perhaps for each of these, you could instead have something like:
>
> ```
> ZStdCompressionAlgorithm *getZStdCompressionAlgorithm() {
> static ZStdCompressionAlgorithm* instance = new ZStdCompressionAlgorithm;
> return instance;
> }
> ```
>
> This way the instances are only new'd when they're actually used.
Yep, I'd mentioned/suggested that (so, seconding here) elsewhere encouraging these to be singletons: https://reviews.llvm.org/D130516#3683384
And they don't even need to be 'new'd in that case, this would be fine:
```
ZstdCompressionAlgorithm &getZstdCompressionAlgorithm() {
static ZstdCompressionAlgorithm C;
return C;
}
```
Though I think maybe we don't need individual access to the algorithms, and it'd be fine to have only a single entry point like this:
```
CompressionAlgorithm *getCompressionAlgorithm(DebugCompressionType T) {
switch (T) {
case DebugCompressionType::ZStd: {
static zstd::CompressionAlgorithm Zstd;
if (zstd::isAvailable())
return &Zstd;
}
...
}
return nullptr;
}
```
(or, possibly, we want to return non-null even if it isn't available, if we include other things (like the configure macro name - so callers can use that name to print helpful error messages - but then they have to explicitly check if the algorithm is available after the call))
================
Comment at: llvm/lib/Support/Compression.cpp:195-199
+constexpr SupportCompressionType ZStdCompressionAlgorithm::AlgorithmId;
+constexpr StringRef ZStdCompressionAlgorithm::Name;
+constexpr int ZStdCompressionAlgorithm::BestSpeedCompression;
+constexpr int ZStdCompressionAlgorithm::DefaultCompression;
+constexpr int ZStdCompressionAlgorithm::BestSizeCompression;
----------------
I don't think there's particular value in these being constexpr members - and maybe we don't need these at all just yet/could leave them out for now? It'd be great to reduce this whole patch to something more comparable with https://reviews.llvm.org/D130458
If you have plans for these other properties it might be helpful to understand what they are - they might help inform the design discussion.
(if we are keeping tnhese properties, including the string version of the name, etc - I'd think the way to do it would be for the base algorithm class to have non-static members to store these, and derived algorithm classes to pass the values into the base ctor to be stored in the members - they could even be const public members of the algorithm to be accessed directly, rather than via accessor functions (& certainly not virtual accessor functions))
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D130516/new/
https://reviews.llvm.org/D130516
More information about the cfe-commits
mailing list