[PATCH] Add new ubegin/uend API to StringRef/MDString

David Blaikie dblaikie at gmail.com
Thu Nov 13 10:35:52 PST 2014


On Thu, Nov 13, 2014 at 4:39 AM, Duncan P. N. Exon Smith <
dexonsmith at apple.com> wrote:

>
> > On 2014 Nov 12, at 16:55, Keno Fischer <kfischer at college.harvard.edu>
> wrote:
> >
> > Hi dexonsmith,
> >
> > In some instances (such as in BitCodeWriter - see
> http://reviews.llvm.org/D6184),
> > it is useful to get an unsiged version of the data pointer instead. This
> adds an
> > appropriate API to be used in BitCodeWriter in a follow up commit.
> >
> > http://reviews.llvm.org/D6241
> >
> > Files:
> >  include/llvm/ADT/StringRef.h
> >  include/llvm/IR/Metadata.h
> > <D6241.16125.patch>
>
> This LGTM.  I have minor nitpicks below that you can take or leave.
>
> @David, does this API make sense to you given Keno's use case [1]?
>

While it matches that use case quite well (needing iterator begin/end)
there are many other cases where we play the same game - getting uint8_t
rather than just unsigned char:

lib/Support/DataExtractor.cpp:  return getU<uint8_t>(offset_ptr, this,
IsLittleEndian, Data.data());
lib/Support/MD5.cpp:  const uint8_t *Ptr = Data.data();
lib/Support/MD5.cpp:  ArrayRef<uint8_t> SVal((const uint8_t *)Str.data(),
Str.size());
lib/IR/Constants.cpp:    const uint8_t *Data = reinterpret_cast<const
uint8_t *>(Str.data());
lib/ProfileData/CoverageMappingReader.cpp:  Result =
decodeULEB128(reinterpret_cast<const uint8_t *>(Data.data()), &N);
lib/ProfileData/CoverageMappingReader.cpp:
 decodeULEB128(reinterpret_cast<const uint8_t *>(Data.data()), &N);
lib/ProfileData/CoverageMappingReader.cpp:
 decodeULEB128(reinterpret_cast<const uint8_t *>(Data.data()), &N);
lib/ExecutionEngine/RuntimeDyld/RuntimeDyldChecker.cpp:
 reinterpret_cast<const uint8_t *>(SectionMem.data()),
lib/ExecutionEngine/Interpreter/ExternalFunctions.cpp:  uint8_t *ArgDataPtr
= ArgData.data();
lib/DebugInfo/DWARFFormValue.cpp:      Value.data = reinterpret_cast<const
uint8_t *>(str.data());
include/llvm/Support/FileOutputBuffer.h:    return (uint8_t*)Region->data();
include/llvm/Support/FileOutputBuffer.h:    return (uint8_t*)Region->data()
+ Region->size();
include/llvm/Object/ELF.h:    return reinterpret_cast<const uint8_t
*>(Buf.data());
include/llvm/MC/YAML.h:      : Data(reinterpret_cast<const uint8_t
*>(Data.data()), Data.size()),
tools/llvm-objdump/llvm-objdump.cpp:    ArrayRef<uint8_t>
Bytes(reinterpret_cast<const uint8_t *>(BytesStr.data()),
tools/llvm-objdump/MachODump.cpp:    ArrayRef<uint8_t>
Bytes(reinterpret_cast<const uint8_t *>(BytesStr.data()),
tools/llvm-mc/Disassembler.cpp:  ArrayRef<uint8_t> Data(Bytes.first.data(),
Bytes.first.size());
tools/llvm-readobj/StreamWriter.h:    auto V =
makeArrayRef(reinterpret_cast<const uint8_t*>(Value.data()),
tools/llvm-readobj/StreamWriter.h:    auto V =
makeArrayRef(reinterpret_cast<const uint8_t*>(Value.data()),
tools/llvm-readobj/StreamWriter.h:    auto V =
makeArrayRef(reinterpret_cast<const uint8_t*>(Value.data()),
tools/llvm-readobj/StreamWriter.h:    auto V =
makeArrayRef(reinterpret_cast<const uint8_t*>(Value.data()),

In several of these cases we're interested in getting an ArrayRef<uint8_t>
from the StringRef - so maybe that's a better primitive operation (it's
annoying when interoperating with containers, I'll grant you - since you
need to call it twice, or stash a local variable - but one day... we'll
have range algorithms for this sort of thing - in the mean time we could
have utility functions like those in this patch, but I think the primary
operation should probably be "give me a raw (uint8_t) byte array for this
StringRef")

Also see Aaron's recent -Wcast-qual fixing patches that touched several
cases like this:
http://llvm.org/viewvc/llvm-project?view=revision&revision=221782
http://llvm.org/viewvc/llvm-project?view=revision&revision=221781


>
> [1]:
> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20141103/243658.html
>
> > Index: include/llvm/ADT/StringRef.h
> > ===================================================================
> > --- include/llvm/ADT/StringRef.h
> > +++ include/llvm/ADT/StringRef.h
> > @@ -40,6 +40,7 @@
> >    class StringRef {
> >    public:
> >      typedef const char *iterator;
> > +    typedef unsigned const char *uiterator;
>
> This spelling is strange to me; I'd expect:
>
>     typedef const unsigned char *uiterator;
>
> >      typedef const char *const_iterator;
> >      static const size_t npos = ~size_t(0);
> >      typedef size_t size_type;
> > @@ -88,8 +89,12 @@
> >      /// @{
> >
> >      iterator begin() const { return Data; }
> > +    uiterator ubegin() const { return
> reinterpret_cast<uiterator>(Data); }
>
> I have a slight preference for:
>
>     uiterator ubegin() const { return
> reinterpret_cast<uiterator>(begin()); }
>
> But mainly just to match my change to `uend()` below.
>
> >
> >      iterator end() const { return Data + Length; }
> > +    uiterator uend() const {
> > +      return reinterpret_cast<uiterator>(Data) + Length;
> > +    }
>
> I think this should be:
>
>     uiterator uend() const { return reinterpret_cast<uiterator>(end()); }
>
> >
> >      /// @}
> >      /// @name String Operations
> > Index: include/llvm/IR/Metadata.h
> > ===================================================================
> > --- include/llvm/IR/Metadata.h
> > +++ include/llvm/IR/Metadata.h
> > @@ -55,12 +55,15 @@
> >    unsigned getLength() const { return (unsigned)getName().size(); }
> >
> >    typedef StringRef::iterator iterator;
> > +  typedef StringRef::uiterator uiterator;
> >
> >    /// \brief Pointer to the first byte of the string.
> >    iterator begin() const { return getName().begin(); }
> > +  uiterator ubegin() const { return getName().ubegin(); }
> >
> >    /// \brief Pointer to one byte past the end of the string.
> >    iterator end() const { return getName().end(); }
> > +  uiterator uend() const { return getName().uend(); }
> >
> >    /// \brief Methods for support type inquiry through isa, cast, and
> dyn_cast.
> >    static bool classof(const Value *V) {
> >
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141113/f26202b7/attachment.html>


More information about the llvm-commits mailing list