[PATCH] D68570: Unify the two CRC implementations

Wed Oct 9 01:33:47 PDT 2019

hans marked 4 inline comments as done.
hans added a comment.

In D68570#1700024 <https://reviews.llvm.org/D68570#1700024>, @rupprecht wrote:

> In D68570#1699204 <https://reviews.llvm.org/D68570#1699204>, @hans wrote:
>
> > In D68570#1697872 <https://reviews.llvm.org/D68570#1697872>, @rupprecht wrote:
> >
> > > Do you have any benchmarks?
> >
> >
> > Of what? That generating the table is slower than just using it directly? I'd say no benchmark is needed to conclude that :-)
>
>
> There is also the cost of code complexity to consider. Sure, the table generation goes away, but does that even show up on any benchmark/real world usage of this? Nobody's exercising the table generation code. People are either check-summing once (in which case, it doesn't matter as much if it's slow, because it happens once) or check-summing 1000 times (in which case the one-time table generation is probably not the CRC bottleneck). If there's no performance win, is it worth the potential code complexity of an opaque hex array over constructing it with an algorithm that can be inspected?

You're right that it probably doesn't matter much, it's more the principle that it's wasteful to compute it at runtime.

I think maybe our disagreement comes from that I actually see including the table directly as *less* code complexity than the way we currently generate it. It's a code pattern for computing CRC-32. If you search the web for how to compute CRC-32, this table shows up. Some specs even define CRC-32 in terms of the table (e.g. https://docs.microsoft.com/en-us/openspecs/office_protocols/ms-abs/06966aa2-70da-4bf9-8448-3355f277cd77).

I think the code to generate the table (and to use it) is just as opaque. Where did the current code come from? It's from https://github.com/libarchive/libarchive/blob/master/libarchive/archive_crc32.h How did it work? There are no comments.

It's a fancy algorithm and there's no way around reading some external sources to figure out how it works. Using the pre-computed table directly removes some unnecessary work and in my opinion it's clearer than the previous code.

> FWIW I'm still in favor of this patch, and the objcopy part that Herald added me for LGTM. Addressing my comments would help me understand this code but feel free to ignore if I'm the only one that feels this way.

I appreciate your comments and I hope I've addressed them satisfactorily.

In D68570#1700080 <https://reviews.llvm.org/D68570#1700080>, @rnk wrote:

> Maybe a dumb idea: can we compute the table with constexpr evaluation? You could set up a constexpr function that returns a struct that wraps the array, and then the body of the function would construct the array imperatively as the old initialization code did. Set up a constexpr global with an initializer that calls the function.

This sounds fun, but I don't think it would be a step towards less code complexity :-)

> If not all compilers support this, we could instead use static_assert to check that the table is correct under an `#ifdef __clang__`, or whatever conditions apply.
> 
> One possible drawback of this approach is compile time. If that ends up mattering, it could be ifdef EXPENSIVE_CHECKS or something awful like that.
> 
> Another drawback is that I have seen MSVC accept globals marked constexpr, but then sometimes they get dynamic initializers anyway. The static_assert approach might be safer.

I've extended the unit test to exercise the whole table. That should give us the benefit of static_assert'ing that it's correct, without the code complexity or compile time.

================
Comment at: llvm/lib/Support/CRC.cpp:26

-uint32_t llvm::crc32(uint32_t CRC, StringRef S) {
-  static llvm::once_flag InitFlag;
-  static CRC32Table Tbl;
-  llvm::call_once(InitFlag, initCRC32Table, &Tbl);
+static const uint32_t CRCTable[256] = {
+    0x00000000, 0x77073096, 0xee0e612c, 0x990951ba,
----------------
rupprecht wrote:
> hans wrote:
> > hiraditya wrote:
> > > rupprecht wrote:
> > > > Can you leave a comment how this table was generated/how it could be regenerated if needed in the future? And/or a unit test to assert the values are correct?
> > > +1
> > I'm adding a comment with references for how the algorithm and the table works. There is already a unit test in llvm/unittests/Support/CRCTest.cpp
> From what I can tell:
> - This algorithm makes use of one byte from this table per byte in the input array
> - The table has 256 entries
> - The test only includes "The quick brown fox jumps over the lazy dog" and "123456789" (43 total values if I'm counting correctly)
> There's no way that test sufficiently tests that all 256 values here are correct. Maybe check summing a very large (>100k) buffer would give confidence that -- probably -- all values are being used.
> 
> I think the unit test is sufficient for testing a generated table because it's hard to mess up the generation in a way that only affects some values. However with a hard-coded table, it's much more likely that a single value here could be wrong due to mechanical issues (e.g. accidentally changing a character in the process of formatting it).
That's a fair point. I've added a test that exercises the whole table.

================
Comment at: llvm/lib/Support/CRC.cpp:29-37
-  for (size_t I = 0; I < Tbl->size(); ++I) {
-    uint32_t V = Shuffle(I);
-    V = Shuffle(V);
-    V = Shuffle(V);
-    V = Shuffle(V);
-    V = Shuffle(V);
-    V = Shuffle(V);
----------------
rupprecht wrote:
> I'd personally find keeping this as a comment to be more useful in understanding how the table is generated than digging up papers
But to understand this code, one would need to dig up papers too.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68570/new/

https://reviews.llvm.org/D68570