[PATCH] D130516: [llvm] compression classes

David Blaikie via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Aug 5 17:13:10 PDT 2022

dblaikie added a comment.

This is looking pretty close to what I've been picturing - the only thing remaining is that I think we could get rid of `CompressionKind` and access the `CompressionAlgorithm` directly - basically the contents of `CompressionKind::operator->` could be a free/public function `const CompressionAlgorithm &getCompressionAlgorithm(enum class CompressionKind);` - and have it return a reference to the local static implementation, with a none implementation (for those profile cases where you want to pass in an algorithm if it's available, or none) and one implementation for each of zlib/zstd?

Comment at: clang-tools-extra/clangd/index/Serialization.cpp:238
+      constexpr int MaxCompressionRatio = 1032;
+      if ((CompressionScheme == CompressionKind::Zlib) &&
+          UncompressedSize / MaxCompressionRatio > R.rest().size())
What purpose is this expression serving? (isn't it a tautology, given that `CompressionScheme` was initialized a couple of lines back with `CompressionKind::Zlib`?

Comment at: clang/lib/Serialization/ASTWriter.cpp:2003-2004
   // consumers will not want its contents.
-  SmallVector<uint8_t, 0> CompressedBuffer;
-  if (llvm::compression::zlib::isAvailable()) {
-    llvm::compression::zlib::compress(
-        llvm::arrayRefFromStringRef(Blob.drop_back(1)), CompressedBuffer);
+  CompressionKind CompressionScheme = CompressionKind::Zlib;
+  if (CompressionScheme) {
+    SmallVector<uint8_t, 0> CompressedBuffer;
Generally LLVM style rolls these together

Comment at: llvm/lib/Object/Decompressor.cpp:41-47
+  uint64_t ELFCompressionSchemeId = Extractor.getUnsigned(
+      &Offset, Is64Bit ? sizeof(Elf64_Word) : sizeof(Elf32_Word));
+  if (ELFCompressionSchemeId == ELFCOMPRESS_ZLIB) {
+    CompressionScheme = compression::CompressionKind::Zlib;
+  } else {
     return createError("unsupported compression type");
+  }
Maybe leave this code more like it was before - it can turn into a switch over `ELFCompressionSchemeId` when Zstd is added here.

Comment at: llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp:122-132
+    if (!compression::CompressionKind::Zlib)
       return make_error<CoverageMapError>(
     // Allocate memory for the decompressed filenames.
     SmallVector<uint8_t, 0> StorageBuf;
Be nice to share the same CompressionKind

Comment at: llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp:51-58
+  compression::OptionalCompressionKind OptionalCompressionScheme =
+      compression::CompressionKind::Zlib;
+  bool DoCompression = OptionalCompressionScheme && *OptionalCompressionScheme;
+  if (DoCompression) {
+    compression::CompressionKind CompressionScheme = *OptionalCompressionScheme;
Why/how does `OptionalCompressionKind` end up in here, compared with this (suggested edit)?

Comment at: llvm/lib/ProfileData/InstrProf.cpp:465
-  if (!doCompression) {
+  if ((!OptionalCompressionScheme) || (!(*OptionalCompressionScheme)))
     return WriteStringToResult(0, UncompressedNameStrings);
Yeah, this seems awkward, but I see what you're getting at - if you're going to pass around an algorithm to use, and this particular kind of use case wants to collapse the "algorithm not available" and "no compression was requested" - so, yeah, for this sort of use case I can see the valid in having a null compression algorithm implementation. That shouldn't add a lot of complexity to the implementation and simplify the usage so it doesn't have two layers of "not present" state.

Comment at: llvm/lib/ProfileData/InstrProf.cpp:492
+  return collectPGOFuncNameStrings(NameStrs,
+                                   doCompression && OptionalCompressionScheme
+                                       ? OptionalCompressionScheme
Presumably this doesn't need to test `OptionalCompressionScheme` here, though, since it's tested in the implementation?

Comment at: llvm/lib/ProfileData/SampleProfReader.cpp:881-886
+  if (!CompressionKind::Zlib)
     return sampleprof_error::zlib_unavailable;
   uint8_t *Buffer = Allocator.Allocate<uint8_t>(DecompressBufSize);
   size_t UCSize = DecompressBufSize;
+  llvm::Error E = CompressionKind::Zlib->decompress(
Nice to pull out a common variable rather than accessing `CompressionKind::Zlib` twice independently (otherwise we probably might as well go with the more direct API @MaskRay has proposed)

  rG LLVM Github Monorepo



More information about the cfe-commits mailing list