[PATCH] D126254: Add support for decoding base64.

Greg Clayton via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jun 7 14:04:36 PDT 2022


clayborg added a comment.

In D126254#3549946 <https://reviews.llvm.org/D126254#3549946>, @serge-sans-paille wrote:

> Sorry for the back and forth: I find the logic difficult to follow. What would you think of something along these lines (untested) https://godbolt.org/z/vh356fhaP

I found that makes the code a bit harder to read and there is a few dups of the "Invalid Base64 character" error producing code sites.



================
Comment at: llvm/include/llvm/Support/Base64.h:56
 
+template <class OutputBytes>
+llvm::Error decodeBase64(llvm::StringRef Input, OutputBytes &Output) {
----------------
aprantl wrote:
> Why the template? Would something like `SmallVectorImpl<uint8_t>` work here?
Then any collection class can be used, including SmallVectorImpl<uint8_t>. 


================
Comment at: llvm/include/llvm/Support/Base64.h:57
+template <class OutputBytes>
+llvm::Error decodeBase64(llvm::StringRef Input, OutputBytes &Output) {
+  // Invalid table value with short name to fit in the table init below. The
----------------
aprantl wrote:
> I think an `ErrorOr<OutputBytes>` or `Optional<OutputBytes>` would be more idiomatic here?
I can see if that would work. I couldn't remember if you can specialized on return value...


================
Comment at: llvm/include/llvm/Support/Base64.h:69
+      52,  53,  54,  55,  56,  57,  58,  59,  // 01234567
+      60,  61,  Inv, Inv, Inv, 0,   Inv, Inv, // 89...=..
+      Inv, 0,   1,   2,   3,   4,   5,   6,   // .ABCDEFG
----------------
serge-sans-paille wrote:
> clayborg wrote:
> > serge-sans-paille wrote:
> > > Would that make sens to set `'='` as `Inv` here? That would prevent you from the initial find
> > We can't, we need it to decode to zero so we can always decode 3 bytes worth of output. We then drop any extra zero bytes based on "NumBytesToStrip" at the end. It also keeps the logic much cleaner in the main decode loop since we don't have to check.
> I understand your point. Doing two pass over the sequence just for the sake of error checking bothers me a bit, I'd rather have a different handler for the tail. Do you have any idea of the length of a typical sequence? 
I fixed the loop to check for equals in the main loop and got rid of the two loops through the input string.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D126254/new/

https://reviews.llvm.org/D126254



More information about the llvm-commits mailing list