[PATCH] D78796: [Support] Refactor LEB128 encoding into an input iterator
Daniel Sanders via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu May 7 11:55:18 PDT 2020
dsanders added a comment.
I'm inclined to agree that the patch series as-is doesn't really warrant the iterators as the interface as no callers have been updated. However, I also don't see much that's iterator specific (ULEBifier would be roughly similar code leaving the iterator portion as trivial wrappers on the ULEBifier) and there are a few places (particularly in tablegen) that are emitting LEB's into containers where the loop to add bytes one by one is just noise and something like `std::copy(to_uleb(...), std::back_inserter(Table));` would be somewhat nice. The loop could easily be hidden in something like `append_uleb(Table, ...)` though I don't think there's a strong argument for (or against) iterators.
What I would suggest is separating out the byte-sequence generation into a ULEBifier as David suggested but still keeping the iterator object as a thin adapter (effectively implementing that while loop) to support inter-operation with STL functions like std::copy.
================
Comment at: llvm/include/llvm/Support/LEB128.h:26-27
+template <class ValueT> class LEB128InputIterator {
+ /// Denotes whether this object represents an end() iterator.
+ bool IsEnd;
+
----------------
We could fold this into Optional<uint8_t> CurrByte
================
Comment at: llvm/include/llvm/Support/LEB128.h:32-33
+
+ /// Whether this iterator is outputting a signed or unsigned LEB128 data.
+ bool IsSigned;
+
----------------
It may be worth mentioning that this is needed for types that don't carry signedness like APInt. unsigned/int/etc. wouldn't need it
================
Comment at: llvm/include/llvm/Support/LEB128.h:35-36
+
+ /// Whether there will be more output after the previously outputted byte.
bool More;
+
----------------
This is misleading as the iterator can produce more bytes after this becomes false. I think it's also unnecessary as each time encodeNextByte() shifts Value right by 7 it's bringing in the correct padding bits at the top. We could just keep reading from Value for the padding bytes.
================
Comment at: llvm/include/llvm/Support/LEB128.h:38-42
+ /// The output will be sext-ed/zext-ed to this number of bytes if necessary.
+ unsigned PadTo;
+
+ /// The current number of outputted bytes.
+ unsigned Count;
----------------
Would it make sense to have a RemainingBytes that counts down to zero? Is there a reason to keep PadTo and Count separate?
================
Comment at: llvm/include/llvm/Support/LEB128.h:50
+ CurrByte = Value & 0x7f;
+ if (IsSigned) {
+ int64_t SValue = static_cast<int64_t>(Value);
----------------
Is testing this on each byte measurably slower? It seems unlikely but this is called a lot and I notice that the previous code didn't do it. If it does, it would be good if we can have the template instantiation pick one side or the other
================
Comment at: llvm/include/llvm/Support/LEB128.h:121-136
+ // Pad with 0s or 1s depending on whether we want to zext or sext.
+ uint8_t PadValue;
+ if (IsSigned && static_cast<int64_t>(Value) < 0)
+ PadValue = 0x7f;
+ else
+ PadValue = 0x00;
+
----------------
I think this belongs inside encodeNextByte(). This operator should essentially be something like:
assert(CurrByte.hasValue() && "operator++() called on past-the-end LEB128InputIterator");
CurrByte = encodeNextByte();
return this;
where CurrByte is Optional<uint8_t>. As noted in another comment, I think special-casing the padding bytes isn't really needed.
================
Comment at: llvm/include/llvm/Support/LEB128.h:155-157
+ // Both are end-of-input iterators, so they compare the same.
+ if (IsEnd && Other.IsEnd)
+ return true;
----------------
I see no harm in allowing two past-the-end iterators from different sequences to be equal but I wonder if it's necessary/useful. With the ULEBifier object it could give you an end iterator and then this operator would be a plain equality comparison.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D78796/new/
https://reviews.llvm.org/D78796
More information about the llvm-commits
mailing list