[cfe-commits] [patch] Fix fixme about duplicated utf conversion code in codegen and lex

Nico Weber thakis at chromium.org
Fri Jun 29 09:56:32 PDT 2012


All done (except for the static_cast, since that line also converts
signedness).

Any comments on the new file name / location from anyone?


On Wed, Jun 27, 2012 at 7:06 AM, Hans Wennborg <hans at chromium.org> wrote:

> On Mon, Jun 25, 2012 at 4:54 AM, Nico Weber <thakis at chromium.org> wrote:
> > Hi,
> >
> > the attached patch moves the duplicated code in CGExpr.cpp and
> > LiteralSupport.cpp into a new file Basic/ConvertUTFWrapper.h/cpp.
> > Please let me know what you think.
> >
> > Nico
>
> Looks reasonable as far as I can tell. Just some nits:
>
> > +#ifndef LLVM_CLANG_BASIC_CONVERT_UTF_WRAPPER_H
> > +#define LLVM_CLANG_BASIC_CONVERT_UTF_WRAPPER_H
> > +
> > +#include "llvm/ADT/StringRef.h"
> > +
> > +namespace clang {
> > +
> > +/// Convert an UTF8 StringRef to UTF8, UTF16, or UTF32 depending on
> > +/// WideCharWidth. On success, ResultPtr will point one after the end
> of the
> > +/// copied string.
> > +/// \returns true on success.
> > +bool ConvertUTF8toWide(unsigned WideCharWidth, llvm::StringRef Source,
> > +                       char*& ResultPtr);
>
> Should the *& be on ResultPtr here? (and in the .cpp file)
> Should the comment say that it's copying into ResultPtr?
>
> > Index: lib/Basic/ConvertUTFWrapper.cpp
> > ===================================================================
> > --- lib/Basic/ConvertUTFWrapper.cpp   (revision 0)
> > +++ lib/Basic/ConvertUTFWrapper.cpp   (revision 0)
> > @@ -0,0 +1,55 @@
> > +//===-- ConvertUTFWrapper.cpp - Wrap ConvertUTF.h with clang data types
> -----===
> > +//
> > +//                     The LLVM Compiler Infrastructure
> > +//
> > +// This file is distributed under the University of Illinois Open Source
> > +// License. See LICENSE.TXT for details.
> > +//
> >
> +//===----------------------------------------------------------------------===//
> > +
> > +#include "clang/Basic/ConvertUTFWrapper.h"
> > +#include "clang/Basic/ConvertUTF.h"
> > +#include "clang/Basic/LLVM.h"
> > +
> > +namespace clang {
> > +
> > +bool ConvertUTF8toWide(unsigned WideCharWidth, llvm::StringRef Source,
> > +                       char*& ResultPtr) {
> > +  assert(WideCharWidth==1 || WideCharWidth==2 || WideCharWidth==4);
> > +  ConversionResult result = conversionOK;
> > +  // Copy the character span over.
> > +  if (WideCharWidth == 1) {
> > +    if (!isLegalUTF8String(reinterpret_cast<const
> UTF8*>(&*Source.begin()),
> > +                           reinterpret_cast<const
> UTF8*>(&*Source.end())))
>
> The &* isn't really necessary for Source.begin() / end(). Do we
> normally do this for StringRefs? (I see it wasn't there in
> lib/Lex/LiteralSupport.cpp)
>
> > +      result = sourceIllegal;
> > +    memcpy(ResultPtr, Source.data(), Source.size());
> > +    ResultPtr += Source.size();
> > +  } else if (WideCharWidth == 2) {
> > +    UTF8 const *sourceStart = (UTF8 const *)Source.data();
>
> This is equivalent to "const UTF8 *sourceStart", right? At least I
> find that easier to read. And maybe static_cast it?
>
> > +    // FIXME: Make the type of the result buffer correct instead of
> > +    // using reinterpret_cast.
> > +    UTF16 *targetStart = reinterpret_cast<UTF16*>(ResultPtr);
> > +    ConversionFlags flags = strictConversion;
> > +    result = ConvertUTF8toUTF16(
> > +      &sourceStart,sourceStart + Source.size(),
>
> Nit: space between &targetStart and targetStart?
>
> > +        &targetStart,targetStart + 2*Source.size(),flags);
>
> And some more spaces here too..
>
> > +    if (result==conversionOK)
>
> And here :)
>
> > +      ResultPtr = reinterpret_cast<char*>(targetStart);
> > +  } else if (WideCharWidth == 4) {
> > +    UTF8 const *sourceStart = (UTF8 const *)Source.data();
>
> Ditto about "const UTF8* sourceStart" etc.
>
> > +    // FIXME: Make the type of the result buffer correct instead of
> > +    // using reinterpret_cast.
> > +    UTF32 *targetStart = reinterpret_cast<UTF32*>(ResultPtr);
> > +    ConversionFlags flags = strictConversion;
> > +    result = ConvertUTF8toUTF32(
>
> > +        &sourceStart,sourceStart + Source.size(),
> > +        &targetStart,targetStart + 4*Source.size(),flags);
> > +    if (result==conversionOK)
>
> Ditto about spaces.
>
> Thanks,
> Hans
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20120629/fed01540/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: clang-convert-fixme.patch
Type: application/octet-stream
Size: 9259 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20120629/fed01540/attachment.obj>


More information about the cfe-commits mailing list