<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Nov 13, 2014 at 1:59 PM, Duncan P. N. Exon Smith <span dir="ltr"><<a href="mailto:dexonsmith@apple.com" target="_blank">dexonsmith@apple.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5"><br>

> On 2014-Nov-13, at 13:54, David Blaikie <<a href="mailto:dblaikie@gmail.com">dblaikie@gmail.com</a>> wrote:<br>

><br>

><br>

><br>

> On Thu, Nov 13, 2014 at 1:40 PM, Duncan P. N. Exon Smith <<a href="mailto:dexonsmith@apple.com">dexonsmith@apple.com</a>> wrote:<br>

><br>

> > On 2014-Nov-13, at 13:36, David Blaikie <<a href="mailto:dblaikie@gmail.com">dblaikie@gmail.com</a>> wrote:<br>

> ><br>

> ><br>

> ><br>

> > On Thu, Nov 13, 2014 at 1:33 PM, Duncan P. N. Exon Smith <<a href="mailto:dexonsmith@apple.com">dexonsmith@apple.com</a>> wrote:<br>

> ><br>

> > > On 2014-Nov-13, at 13:31, David Blaikie <<a href="mailto:dblaikie@gmail.com">dblaikie@gmail.com</a>> wrote:<br>

> > ><br>

> > ><br>

> > ><br>

> > > On Thu, Nov 13, 2014 at 12:59 PM, Duncan P. N. Exon Smith <<a href="mailto:dexonsmith@apple.com">dexonsmith@apple.com</a>> wrote:<br>

> > ><br>

> > > > On 2014-Nov-13, at 10:35, David Blaikie <<a href="mailto:dblaikie@gmail.com">dblaikie@gmail.com</a>> wrote:<br>

> > > ><br>

> > > ><br>

> > > ><br>

> > > > On Thu, Nov 13, 2014 at 4:39 AM, Duncan P. N. Exon Smith <<a href="mailto:dexonsmith@apple.com">dexonsmith@apple.com</a>> wrote:<br>

> > > ><br>

> > > > > On 2014 Nov 12, at 16:55, Keno Fischer <<a href="mailto:kfischer@college.harvard.edu">kfischer@college.harvard.edu</a>> wrote:<br>

> > > > ><br>

> > > > > Hi dexonsmith,<br>

> > > > ><br>

> > > > > In some instances (such as in BitCodeWriter - see <a href="http://reviews.llvm.org/D6184" target="_blank">http://reviews.llvm.org/D6184</a>),<br>

> > > > > it is useful to get an unsiged version of the data pointer instead. This adds an<br>

> > > > > appropriate API to be used in BitCodeWriter in a follow up commit.<br>

> > > > ><br>

> > > > > <a href="http://reviews.llvm.org/D6241" target="_blank">http://reviews.llvm.org/D6241</a><br>

> > > > ><br>

> > > > > Files:<br>

> > > > >  include/llvm/ADT/StringRef.h<br>

> > > > >  include/llvm/IR/Metadata.h<br>

> > > > > <D6241.16125.patch><br>

> > > ><br>

> > > > This LGTM.  I have minor nitpicks below that you can take or leave.<br>

> > > ><br>

> > > > @David, does this API make sense to you given Keno's use case [1]?<br>

> > > ><br>

> > > > While it matches that use case quite well (needing iterator begin/end) there are many other cases where we play the same game - getting uint8_t rather than just unsigned char:<br>

> > > ><br>

> > > > lib/Support/DataExtractor.cpp:  return getU<uint8_t>(offset_ptr, this, IsLittleEndian, Data.data());<br>

> > > > lib/Support/MD5.cpp:  const uint8_t *Ptr = Data.data();<br>

> > > > lib/Support/MD5.cpp:  ArrayRef<uint8_t> SVal((const uint8_t *)Str.data(), Str.size());<br>

> > > > lib/IR/Constants.cpp:    const uint8_t *Data = reinterpret_cast<const uint8_t *>(Str.data());<br>

> > > > lib/ProfileData/CoverageMappingReader.cpp:  Result = decodeULEB128(reinterpret_cast<const uint8_t *>(Data.data()), &N);<br>

> > > > lib/ProfileData/CoverageMappingReader.cpp:      decodeULEB128(reinterpret_cast<const uint8_t *>(Data.data()), &N);<br>

> > > > lib/ProfileData/CoverageMappingReader.cpp:      decodeULEB128(reinterpret_cast<const uint8_t *>(Data.data()), &N);<br>

> > > > lib/ExecutionEngine/RuntimeDyld/RuntimeDyldChecker.cpp:        reinterpret_cast<const uint8_t *>(SectionMem.data()),<br>

> > > > lib/ExecutionEngine/Interpreter/ExternalFunctions.cpp:  uint8_t *ArgDataPtr = ArgData.data();<br>

> > > > lib/DebugInfo/DWARFFormValue.cpp:      Value.data = reinterpret_cast<const uint8_t *>(str.data());<br>

> > > > include/llvm/Support/FileOutputBuffer.h:    return (uint8_t*)Region->data();<br>

> > > > include/llvm/Support/FileOutputBuffer.h:    return (uint8_t*)Region->data() + Region->size();<br>

> > > > include/llvm/Object/ELF.h:    return reinterpret_cast<const uint8_t *>(Buf.data());<br>

> > > > include/llvm/MC/YAML.h:      : Data(reinterpret_cast<const uint8_t *>(Data.data()), Data.size()),<br>

> > > > tools/llvm-objdump/llvm-objdump.cpp:    ArrayRef<uint8_t> Bytes(reinterpret_cast<const uint8_t *>(BytesStr.data()),<br>

> > > > tools/llvm-objdump/MachODump.cpp:    ArrayRef<uint8_t> Bytes(reinterpret_cast<const uint8_t *>(BytesStr.data()),<br>

> > > > tools/llvm-mc/Disassembler.cpp:  ArrayRef<uint8_t> Data(Bytes.first.data(), Bytes.first.size());<br>

> > > > tools/llvm-readobj/StreamWriter.h:    auto V = makeArrayRef(reinterpret_cast<const uint8_t*>(Value.data()),<br>

> > > > tools/llvm-readobj/StreamWriter.h:    auto V = makeArrayRef(reinterpret_cast<const uint8_t*>(Value.data()),<br>

> > > > tools/llvm-readobj/StreamWriter.h:    auto V = makeArrayRef(reinterpret_cast<const uint8_t*>(Value.data()),<br>

> > > > tools/llvm-readobj/StreamWriter.h:    auto V = makeArrayRef(reinterpret_cast<const uint8_t*>(Value.data()),<br>

> > > ><br>

> > > > In several of these cases we're interested in getting an ArrayRef<uint8_t> from the StringRef - so maybe that's a better primitive operation (it's annoying when interoperating with containers, I'll grant you - since you need to call it twice, or stash a local variable - but one day... we'll have range algorithms for this sort of thing - in the mean time we could have utility functions like those in this patch, but I think the primary operation should probably be "give me a raw (uint8_t) byte array for this StringRef")<br>

> > > ><br>

> > ><br>

> > > Sure.  Do you have a name in mind?  `uref()`?<br>

> > ><br>

> > ><br>

> > > If we're going with a C++-esque name: "ArrayRef<uint8_t> bytes()" ?<br>

> > ><br>

> ><br>

> > `bytes()` sounds like `char` to me.  `ubytes()`?<br>

> ><br>

> > Really? It seems like most places that want bytes deal with unsigned chars, and the existence of this method on a StringRef would imply that it's not just giving identical begin/end as the StringRef's own begin/end, I would imagine...<br>

> ><br>

> > But if you particularly prefer ubytes, sure.<br>

> ><br>

> > Then if someone needs a `signed char` they can add `sbytes()`.<br>

> ><br>

> > sbytes is just a range over char which StringRef already is,<br>

><br>

> `char`, `unsigned char`, and `signed char` are three distinct types.<br>

> If someone depends on have signed chars, they need to cast to<br>

> `signed char`.<br>

><br>

> It's implementation-defined whether `char` is signed, but it's a<br>

> distinct type from the other two in either case.<br>

><br>

> Thanks for the refresher/reminder - I was a bit vague/incorrect there.<br>

><br>

><br>

> > I'm not sure we'd ever need this function (I think it'd be reasonable to have some conversion from StringRef to ArrayRef<char> if we ever need it - though I suspect we won't).<br>

><br>

> I would want to call the `ArrayRef<char>` version `bytes()`.<br>

><br>

> And here.<br>

><br>

> s/char/signed char/ up there if it helps.<br>

><br>

> I'll try to resummarize: I don't think anyone ever really wants signed char in this context, either they don't care (and StringRef/ArrayRef<char> is sufficient - and /maybe/ one day we'll want to ensure an easy conversion between the two, but I don't think I've seen a use case yet & have some doubts about one actually coming into existence) or they want unsigned, which is why I think "bytes()" is fairly unambiguous - the only time you want anything other than the range of char is when you want a range of (unsigned) bytes.<br>

<br>

</div></div>Okay, that's convincing enough for me.<br>

<br>

@Keno, if you add an `ArrayRef<uint8_t> bytes() const` accessor and<br>

define `ubegin()` and `uend()` in terms of it, </blockquote><div><br>For consistency with other range-y accessors, this might be bytes_begin/bytes_end - or the caller can just use bytes().begin(), bytes().end().<br><br>But I don't object to ubegin/uend either, if that's what you guys prefer.<br> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I think you're good<br>

to commit.</blockquote></div><br></div></div>