r173924 - Move UTF conversion routines from clang/lib/Basic to llvm/lib/Support

Wed Jan 30 04:25:05 PST 2013

Hi Dmitri,

I believe this has broken my Windows build:

CMake Error: Unknown Target referenced :
ClangCommentHTMLNamedCharacterReferences
CMake Error: Target: clangAST depends on unknown target:
ClangCommentHTMLNamedCharacterReferences

Can you please take a look?

2013/1/30 Dmitri Gribenko <gribozavr at gmail.com>:
> Author: gribozavr
> Date: Wed Jan 30 06:06:08 2013
> New Revision: 173924
>
> URL: http://llvm.org/viewvc/llvm-project?rev=173924&view=rev
> Log:
> Move UTF conversion routines from clang/lib/Basic to llvm/lib/Support
>
> This is required to use them in TableGen.
>
> Removed:
>     cfe/trunk/include/clang/Basic/ConvertUTF.h
>     cfe/trunk/lib/Basic/ConvertUTF.c
>     cfe/trunk/lib/Basic/ConvertUTFWrapper.cpp
> Modified:
>     cfe/trunk/lib/AST/CMakeLists.txt
>     cfe/trunk/lib/AST/CommentLexer.cpp
>     cfe/trunk/lib/Basic/CMakeLists.txt
>     cfe/trunk/lib/CodeGen/CGExpr.cpp
>     cfe/trunk/lib/CodeGen/CodeGenModule.cpp
>     cfe/trunk/lib/Frontend/TextDiagnostic.cpp
>     cfe/trunk/lib/Lex/Lexer.cpp
>     cfe/trunk/lib/Lex/LiteralSupport.cpp
>     cfe/trunk/lib/Lex/Preprocessor.cpp
>     cfe/trunk/lib/Sema/SemaChecking.cpp
>
> Removed: cfe/trunk/include/clang/Basic/ConvertUTF.h
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/ConvertUTF.h?rev=173923&view=auto
> ==============================================================================
> --- cfe/trunk/include/clang/Basic/ConvertUTF.h (original)
> +++ cfe/trunk/include/clang/Basic/ConvertUTF.h (removed)
> @@ -1,230 +0,0 @@
> -/*===--- ConvertUTF.h - Universal Character Names conversions ---------------===
> - *
> - *                     The LLVM Compiler Infrastructure
> - *
> - * This file is distributed under the University of Illinois Open Source
> - * License. See LICENSE.TXT for details.
> - *
> - *==------------------------------------------------------------------------==*/
> -/*
> - * Copyright 2001-2004 Unicode, Inc.
> - *
> - * Disclaimer
> - *
> - * This source code is provided as is by Unicode, Inc. No claims are
> - * made as to fitness for any particular purpose. No warranties of any
> - * kind are expressed or implied. The recipient agrees to determine
> - * applicability of information provided. If this file has been
> - * purchased on magnetic or optical media from Unicode, Inc., the
> - * sole remedy for any claim will be exchange of defective media
> - * within 90 days of receipt.
> - *
> - * Limitations on Rights to Redistribute This Code
> - *
> - * Unicode, Inc. hereby grants the right to freely use the information
> - * supplied in this file in the creation of products supporting the
> - * Unicode Standard, and to make copies of this file in any form
> - * for internal or external distribution as long as this notice
> - * remains attached.
> - */
> -
> -/* ---------------------------------------------------------------------
> -
> -    Conversions between UTF32, UTF-16, and UTF-8.  Header file.
> -
> -    Several funtions are included here, forming a complete set of
> -    conversions between the three formats.  UTF-7 is not included
> -    here, but is handled in a separate source file.
> -
> -    Each of these routines takes pointers to input buffers and output
> -    buffers.  The input buffers are const.
> -
> -    Each routine converts the text between *sourceStart and sourceEnd,
> -    putting the result into the buffer between *targetStart and
> -    targetEnd. Note: the end pointers are *after* the last item: e.g.
> -    *(sourceEnd - 1) is the last item.
> -
> -    The return result indicates whether the conversion was successful,
> -    and if not, whether the problem was in the source or target buffers.
> -    (Only the first encountered problem is indicated.)
> -
> -    After the conversion, *sourceStart and *targetStart are both
> -    updated to point to the end of last text successfully converted in
> -    the respective buffers.
> -
> -    Input parameters:
> -        sourceStart - pointer to a pointer to the source buffer.
> -                The contents of this are modified on return so that
> -                it points at the next thing to be converted.
> -        targetStart - similarly, pointer to pointer to the target buffer.
> -        sourceEnd, targetEnd - respectively pointers to the ends of the
> -                two buffers, for overflow checking only.
> -
> -    These conversion functions take a ConversionFlags argument. When this
> -    flag is set to strict, both irregular sequences and isolated surrogates
> -    will cause an error.  When the flag is set to lenient, both irregular
> -    sequences and isolated surrogates are converted.
> -
> -    Whether the flag is strict or lenient, all illegal sequences will cause
> -    an error return. This includes sequences such as: <F4 90 80 80>, <C0 80>,
> -    or <A0> in UTF-8, and values above 0x10FFFF in UTF-32. Conformant code
> -    must check for illegal sequences.
> -
> -    When the flag is set to lenient, characters over 0x10FFFF are converted
> -    to the replacement character; otherwise (when the flag is set to strict)
> -    they constitute an error.
> -
> -    Output parameters:
> -        The value "sourceIllegal" is returned from some routines if the input
> -        sequence is malformed.  When "sourceIllegal" is returned, the source
> -        value will point to the illegal value that caused the problem. E.g.,
> -        in UTF-8 when a sequence is malformed, it points to the start of the
> -        malformed sequence.
> -
> -    Author: Mark E. Davis, 1994.
> -    Rev History: Rick McGowan, fixes & updates May 2001.
> -         Fixes & updates, Sept 2001.
> -
> ------------------------------------------------------------------------- */
> -
> -#ifndef CLANG_BASIC_CONVERTUTF_H
> -#define CLANG_BASIC_CONVERTUTF_H
> -
> -/* ---------------------------------------------------------------------
> -    The following 4 definitions are compiler-specific.
> -    The C standard does not guarantee that wchar_t has at least
> -    16 bits, so wchar_t is no less portable than unsigned short!
> -    All should be unsigned values to avoid sign extension during
> -    bit mask & shift operations.
> ------------------------------------------------------------------------- */
> -
> -typedef unsigned int    UTF32;  /* at least 32 bits */
> -typedef unsigned short  UTF16;  /* at least 16 bits */
> -typedef unsigned char   UTF8;   /* typically 8 bits */
> -typedef unsigned char   Boolean; /* 0 or 1 */
> -
> -/* Some fundamental constants */
> -#define UNI_REPLACEMENT_CHAR (UTF32)0x0000FFFD
> -#define UNI_MAX_BMP (UTF32)0x0000FFFF
> -#define UNI_MAX_UTF16 (UTF32)0x0010FFFF
> -#define UNI_MAX_UTF32 (UTF32)0x7FFFFFFF
> -#define UNI_MAX_LEGAL_UTF32 (UTF32)0x0010FFFF
> -
> -#define UNI_MAX_UTF8_BYTES_PER_CODE_POINT 4
> -
> -typedef enum {
> -  conversionOK,           /* conversion successful */
> -  sourceExhausted,        /* partial character in source, but hit end */
> -  targetExhausted,        /* insuff. room in target for conversion */
> -  sourceIllegal           /* source sequence is illegal/malformed */
> -} ConversionResult;
> -
> -typedef enum {
> -  strictConversion = 0,
> -  lenientConversion
> -} ConversionFlags;
> -
> -/* This is for C++ and does no harm in C */
> -#ifdef __cplusplus
> -extern "C" {
> -#endif
> -
> -ConversionResult ConvertUTF8toUTF16 (
> -  const UTF8** sourceStart, const UTF8* sourceEnd,
> -  UTF16** targetStart, UTF16* targetEnd, ConversionFlags flags);
> -
> -ConversionResult ConvertUTF8toUTF32 (
> -  const UTF8** sourceStart, const UTF8* sourceEnd,
> -  UTF32** targetStart, UTF32* targetEnd, ConversionFlags flags);
> -
> -#ifdef CLANG_NEEDS_THESE_ONE_DAY
> -ConversionResult ConvertUTF16toUTF8 (
> -  const UTF16** sourceStart, const UTF16* sourceEnd,
> -  UTF8** targetStart, UTF8* targetEnd, ConversionFlags flags);
> -#endif
> -
> -ConversionResult ConvertUTF32toUTF8 (
> -  const UTF32** sourceStart, const UTF32* sourceEnd,
> -  UTF8** targetStart, UTF8* targetEnd, ConversionFlags flags);
> -
> -ConversionResult ConvertUTF16toUTF32 (
> -  const UTF16** sourceStart, const UTF16* sourceEnd,
> -  UTF32** targetStart, UTF32* targetEnd, ConversionFlags flags);
> -
> -ConversionResult ConvertUTF32toUTF16 (
> -  const UTF32** sourceStart, const UTF32* sourceEnd,
> -  UTF16** targetStart, UTF16* targetEnd, ConversionFlags flags);
> -
> -Boolean isLegalUTF8Sequence(const UTF8 *source, const UTF8 *sourceEnd);
> -
> -Boolean isLegalUTF8String(const UTF8 **source, const UTF8 *sourceEnd);
> -
> -unsigned getNumBytesForUTF8(UTF8 firstByte);
> -
> -#ifdef __cplusplus
> -}
> -
> -/*************************************************************************/
> -/* Below are LLVM-specific wrappers of the functions above. */
> -
> -#include "llvm/ADT/StringRef.h"
> -
> -namespace clang {
> -
> -/**
> - * Convert an UTF8 StringRef to UTF8, UTF16, or UTF32 depending on
> - * WideCharWidth. The converted data is written to ResultPtr, which needs to
> - * point to at least WideCharWidth * (Source.Size() + 1) bytes. On success,
> - * ResultPtr will point one after the end of the copied string. On failure,
> - * ResultPtr will not be changed, and ErrorPtr will be set to the location of
> - * the first character which could not be converted.
> - * \return true on success.
> - */
> -bool ConvertUTF8toWide(unsigned WideCharWidth, llvm::StringRef Source,
> -                       char *&ResultPtr, const UTF8 *&ErrorPtr);
> -
> -/**
> - * Convert an Unicode code point to UTF8 sequence.
> - *
> - * \param Source a Unicode code point.
> - * \param [in,out] ResultPtr pointer to the output buffer, needs to be at least
> - * \c UNI_MAX_UTF8_BYTES_PER_CODE_POINT bytes.  On success \c ResultPtr is
> - * updated one past end of the converted sequence.
> - *
> - * \returns true on success.
> - */
> -bool ConvertCodePointToUTF8(unsigned Source, char *&ResultPtr);
> -
> -/**
> - * Convert the first UTF8 sequence in the given source buffer to a UTF32
> - * code point.
> - *
> - * \param [in,out] source A pointer to the source buffer. If the conversion
> - * succeeds, this pointer will be updated to point to the byte just past the
> - * end of the converted sequence.
> - * \param sourceEnd A pointer just past the end of the source buffer.
> - * \param [out] target The converted code
> - * \param flags Whether the conversion is strict or lenient.
> - *
> - * \returns conversionOK on success
> - *
> - * \sa ConvertUTF8toUTF32
> - */
> -static inline ConversionResult convertUTF8Sequence(const UTF8 **source,
> -                                                   const UTF8 *sourceEnd,
> -                                                   UTF32 *target,
> -                                                   ConversionFlags flags) {
> -  if (*source == sourceEnd)
> -    return sourceExhausted;
> -  unsigned size = getNumBytesForUTF8(**source);
> -  if ((ptrdiff_t)size > sourceEnd - *source)
> -    return sourceExhausted;
> -  return ConvertUTF8toUTF32(source, *source + size, &target, target + 1, flags);
> -}
> -}
> -
> -#endif
> -
> -/* --------------------------------------------------------------------- */
> -
> -#endif
>
> Modified: cfe/trunk/lib/AST/CMakeLists.txt
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/AST/CMakeLists.txt?rev=173924&r1=173923&r2=173924&view=diff
> ==============================================================================
> --- cfe/trunk/lib/AST/CMakeLists.txt (original)
> +++ cfe/trunk/lib/AST/CMakeLists.txt Wed Jan 30 06:06:08 2013
> @@ -68,6 +68,7 @@ add_dependencies(clangAST
>    ClangCommentNodes
>    ClangCommentHTMLTags
>    ClangCommentHTMLTagsProperties
> +  ClangCommentHTMLNamedCharacterReferences
>    ClangDeclNodes
>    ClangDiagnosticAST
>    ClangDiagnosticComment
>
> Modified: cfe/trunk/lib/AST/CommentLexer.cpp
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/AST/CommentLexer.cpp?rev=173924&r1=173923&r2=173924&view=diff
> ==============================================================================
> --- cfe/trunk/lib/AST/CommentLexer.cpp (original)
> +++ cfe/trunk/lib/AST/CommentLexer.cpp Wed Jan 30 06:06:08 2013
> @@ -1,8 +1,8 @@
>  #include "clang/AST/CommentLexer.h"
>  #include "clang/AST/CommentCommandTraits.h"
> -#include "clang/Basic/ConvertUTF.h"
>  #include "llvm/ADT/StringExtras.h"
>  #include "llvm/ADT/StringSwitch.h"
> +#include "llvm/Support/ConvertUTF.h"
>  #include "llvm/Support/ErrorHandling.h"
>
>  namespace clang {
> @@ -48,7 +48,7 @@ static unsigned getCodePoint(StringRef N
>  StringRef Lexer::helperResolveHTMLHexCharacterReference(unsigned CodePoint) const {
>    char *Resolved = Allocator.Allocate<char>(UNI_MAX_UTF8_BYTES_PER_CODE_POINT);
>    char *ResolvedPtr = Resolved;
> -  if (ConvertCodePointToUTF8(CodePoint, ResolvedPtr))
> +  if (llvm::ConvertCodePointToUTF8(CodePoint, ResolvedPtr))
>      return StringRef(Resolved, ResolvedPtr - Resolved);
>    else
>      return StringRef();
> @@ -223,7 +223,7 @@ StringRef Lexer::resolveHTMLDecimalChara
>
>    char *Resolved = Allocator.Allocate<char>(UNI_MAX_UTF8_BYTES_PER_CODE_POINT);
>    char *ResolvedPtr = Resolved;
> -  if (ConvertCodePointToUTF8(CodePoint, ResolvedPtr))
> +  if (llvm::ConvertCodePointToUTF8(CodePoint, ResolvedPtr))
>      return StringRef(Resolved, ResolvedPtr - Resolved);
>    else
>      return StringRef();
>
> Modified: cfe/trunk/lib/Basic/CMakeLists.txt
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Basic/CMakeLists.txt?rev=173924&r1=173923&r2=173924&view=diff
> ==============================================================================
> --- cfe/trunk/lib/Basic/CMakeLists.txt (original)
> +++ cfe/trunk/lib/Basic/CMakeLists.txt Wed Jan 30 06:06:08 2013
> @@ -2,8 +2,6 @@ set(LLVM_LINK_COMPONENTS mc)
>
>  add_clang_library(clangBasic
>    Builtins.cpp
> -  ConvertUTF.c
> -  ConvertUTFWrapper.cpp
>    Diagnostic.cpp
>    DiagnosticIDs.cpp
>    FileManager.cpp
>
> Removed: cfe/trunk/lib/Basic/ConvertUTF.c
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Basic/ConvertUTF.c?rev=173923&view=auto
> ==============================================================================
> --- cfe/trunk/lib/Basic/ConvertUTF.c (original)
> +++ cfe/trunk/lib/Basic/ConvertUTF.c (removed)
> @@ -1,571 +0,0 @@
> -/*===--- ConvertUTF.c - Universal Character Names conversions ---------------===
> - *
> - *                     The LLVM Compiler Infrastructure
> - *
> - * This file is distributed under the University of Illinois Open Source
> - * License. See LICENSE.TXT for details.
> - *
> - *===------------------------------------------------------------------------=*/
> -/*
> - * Copyright 2001-2004 Unicode, Inc.
> - *
> - * Disclaimer
> - *
> - * This source code is provided as is by Unicode, Inc. No claims are
> - * made as to fitness for any particular purpose. No warranties of any
> - * kind are expressed or implied. The recipient agrees to determine
> - * applicability of information provided. If this file has been
> - * purchased on magnetic or optical media from Unicode, Inc., the
> - * sole remedy for any claim will be exchange of defective media
> - * within 90 days of receipt.
> - *
> - * Limitations on Rights to Redistribute This Code
> - *
> - * Unicode, Inc. hereby grants the right to freely use the information
> - * supplied in this file in the creation of products supporting the
> - * Unicode Standard, and to make copies of this file in any form
> - * for internal or external distribution as long as this notice
> - * remains attached.
> - */
> -
> -/* ---------------------------------------------------------------------
> -
> -    Conversions between UTF32, UTF-16, and UTF-8. Source code file.
> -    Author: Mark E. Davis, 1994.
> -    Rev History: Rick McGowan, fixes & updates May 2001.
> -    Sept 2001: fixed const & error conditions per
> -        mods suggested by S. Parent & A. Lillich.
> -    June 2002: Tim Dodd added detection and handling of incomplete
> -        source sequences, enhanced error detection, added casts
> -        to eliminate compiler warnings.
> -    July 2003: slight mods to back out aggressive FFFE detection.
> -    Jan 2004: updated switches in from-UTF8 conversions.
> -    Oct 2004: updated to use UNI_MAX_LEGAL_UTF32 in UTF-32 conversions.
> -
> -    See the header file "ConvertUTF.h" for complete documentation.
> -
> ------------------------------------------------------------------------- */
> -
> -
> -#include "clang/Basic/ConvertUTF.h"
> -#ifdef CVTUTF_DEBUG
> -#include <stdio.h>
> -#endif
> -
> -static const int halfShift  = 10; /* used for shifting by 10 bits */
> -
> -static const UTF32 halfBase = 0x0010000UL;
> -static const UTF32 halfMask = 0x3FFUL;
> -
> -#define UNI_SUR_HIGH_START  (UTF32)0xD800
> -#define UNI_SUR_HIGH_END    (UTF32)0xDBFF
> -#define UNI_SUR_LOW_START   (UTF32)0xDC00
> -#define UNI_SUR_LOW_END     (UTF32)0xDFFF
> -#define false      0
> -#define true        1
> -
> -/* --------------------------------------------------------------------- */
> -
> -/*
> - * Index into the table below with the first byte of a UTF-8 sequence to
> - * get the number of trailing bytes that are supposed to follow it.
> - * Note that *legal* UTF-8 values can't have 4 or 5-bytes. The table is
> - * left as-is for anyone who may want to do such conversion, which was
> - * allowed in earlier algorithms.
> - */
> -static const char trailingBytesForUTF8[256] = {
> -    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
> -    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
> -    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
> -    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
> -    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
> -    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
> -    1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
> -    2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2, 3,3,3,3,3,3,3,3,4,4,4,4,5,5,5,5
> -};
> -
> -/*
> - * Magic values subtracted from a buffer value during UTF8 conversion.
> - * This table contains as many values as there might be trailing bytes
> - * in a UTF-8 sequence.
> - */
> -static const UTF32 offsetsFromUTF8[6] = { 0x00000000UL, 0x00003080UL, 0x000E2080UL,
> -                     0x03C82080UL, 0xFA082080UL, 0x82082080UL };
> -
> -/*
> - * Once the bits are split out into bytes of UTF-8, this is a mask OR-ed
> - * into the first byte, depending on how many bytes follow.  There are
> - * as many entries in this table as there are UTF-8 sequence types.
> - * (I.e., one byte sequence, two byte... etc.). Remember that sequencs
> - * for *legal* UTF-8 will be 4 or fewer bytes total.
> - */
> -static const UTF8 firstByteMark[7] = { 0x00, 0x00, 0xC0, 0xE0, 0xF0, 0xF8, 0xFC };
> -
> -/* --------------------------------------------------------------------- */
> -
> -/* The interface converts a whole buffer to avoid function-call overhead.
> - * Constants have been gathered. Loops & conditionals have been removed as
> - * much as possible for efficiency, in favor of drop-through switches.
> - * (See "Note A" at the bottom of the file for equivalent code.)
> - * If your compiler supports it, the "isLegalUTF8" call can be turned
> - * into an inline function.
> - */
> -
> -
> -/* --------------------------------------------------------------------- */
> -
> -ConversionResult ConvertUTF32toUTF16 (
> -        const UTF32** sourceStart, const UTF32* sourceEnd,
> -        UTF16** targetStart, UTF16* targetEnd, ConversionFlags flags) {
> -    ConversionResult result = conversionOK;
> -    const UTF32* source = *sourceStart;
> -    UTF16* target = *targetStart;
> -    while (source < sourceEnd) {
> -        UTF32 ch;
> -        if (target >= targetEnd) {
> -            result = targetExhausted; break;
> -        }
> -        ch = *source++;
> -        if (ch <= UNI_MAX_BMP) { /* Target is a character <= 0xFFFF */
> -            /* UTF-16 surrogate values are illegal in UTF-32; 0xffff or 0xfffe are both reserved values */
> -            if (ch >= UNI_SUR_HIGH_START && ch <= UNI_SUR_LOW_END) {
> -                if (flags == strictConversion) {
> -                    --source; /* return to the illegal value itself */
> -                    result = sourceIllegal;
> -                    break;
> -                } else {
> -                    *target++ = UNI_REPLACEMENT_CHAR;
> -                }
> -            } else {
> -                *target++ = (UTF16)ch; /* normal case */
> -            }
> -        } else if (ch > UNI_MAX_LEGAL_UTF32) {
> -            if (flags == strictConversion) {
> -                result = sourceIllegal;
> -            } else {
> -                *target++ = UNI_REPLACEMENT_CHAR;
> -            }
> -        } else {
> -            /* target is a character in range 0xFFFF - 0x10FFFF. */
> -            if (target + 1 >= targetEnd) {
> -                --source; /* Back up source pointer! */
> -                result = targetExhausted; break;
> -            }
> -            ch -= halfBase;
> -            *target++ = (UTF16)((ch >> halfShift) + UNI_SUR_HIGH_START);
> -            *target++ = (UTF16)((ch & halfMask) + UNI_SUR_LOW_START);
> -        }
> -    }
> -    *sourceStart = source;
> -    *targetStart = target;
> -    return result;
> -}
> -
> -/* --------------------------------------------------------------------- */
> -
> -ConversionResult ConvertUTF16toUTF32 (
> -        const UTF16** sourceStart, const UTF16* sourceEnd,
> -        UTF32** targetStart, UTF32* targetEnd, ConversionFlags flags) {
> -    ConversionResult result = conversionOK;
> -    const UTF16* source = *sourceStart;
> -    UTF32* target = *targetStart;
> -    UTF32 ch, ch2;
> -    while (source < sourceEnd) {
> -        const UTF16* oldSource = source; /*  In case we have to back up because of target overflow. */
> -        ch = *source++;
> -        /* If we have a surrogate pair, convert to UTF32 first. */
> -        if (ch >= UNI_SUR_HIGH_START && ch <= UNI_SUR_HIGH_END) {
> -            /* If the 16 bits following the high surrogate are in the source buffer... */
> -            if (source < sourceEnd) {
> -                ch2 = *source;
> -                /* If it's a low surrogate, convert to UTF32. */
> -                if (ch2 >= UNI_SUR_LOW_START && ch2 <= UNI_SUR_LOW_END) {
> -                    ch = ((ch - UNI_SUR_HIGH_START) << halfShift)
> -                        + (ch2 - UNI_SUR_LOW_START) + halfBase;
> -                    ++source;
> -                } else if (flags == strictConversion) { /* it's an unpaired high surrogate */
> -                    --source; /* return to the illegal value itself */
> -                    result = sourceIllegal;
> -                    break;
> -                }
> -            } else { /* We don't have the 16 bits following the high surrogate. */
> -                --source; /* return to the high surrogate */
> -                result = sourceExhausted;
> -                break;
> -            }
> -        } else if (flags == strictConversion) {
> -            /* UTF-16 surrogate values are illegal in UTF-32 */
> -            if (ch >= UNI_SUR_LOW_START && ch <= UNI_SUR_LOW_END) {
> -                --source; /* return to the illegal value itself */
> -                result = sourceIllegal;
> -                break;
> -            }
> -        }
> -        if (target >= targetEnd) {
> -            source = oldSource; /* Back up source pointer! */
> -            result = targetExhausted; break;
> -        }
> -        *target++ = ch;
> -    }
> -    *sourceStart = source;
> -    *targetStart = target;
> -#ifdef CVTUTF_DEBUG
> -if (result == sourceIllegal) {
> -    fprintf(stderr, "ConvertUTF16toUTF32 illegal seq 0x%04x,%04x\n", ch, ch2);
> -    fflush(stderr);
> -}
> -#endif
> -    return result;
> -}
> -ConversionResult ConvertUTF16toUTF8 (
> -        const UTF16** sourceStart, const UTF16* sourceEnd,
> -        UTF8** targetStart, UTF8* targetEnd, ConversionFlags flags) {
> -    ConversionResult result = conversionOK;
> -    const UTF16* source = *sourceStart;
> -    UTF8* target = *targetStart;
> -    while (source < sourceEnd) {
> -        UTF32 ch;
> -        unsigned short bytesToWrite = 0;
> -        const UTF32 byteMask = 0xBF;
> -        const UTF32 byteMark = 0x80;
> -        const UTF16* oldSource = source; /* In case we have to back up because of target overflow. */
> -        ch = *source++;
> -        /* If we have a surrogate pair, convert to UTF32 first. */
> -        if (ch >= UNI_SUR_HIGH_START && ch <= UNI_SUR_HIGH_END) {
> -            /* If the 16 bits following the high surrogate are in the source buffer... */
> -            if (source < sourceEnd) {
> -                UTF32 ch2 = *source;
> -                /* If it's a low surrogate, convert to UTF32. */
> -                if (ch2 >= UNI_SUR_LOW_START && ch2 <= UNI_SUR_LOW_END) {
> -                    ch = ((ch - UNI_SUR_HIGH_START) << halfShift)
> -                        + (ch2 - UNI_SUR_LOW_START) + halfBase;
> -                    ++source;
> -                } else if (flags == strictConversion) { /* it's an unpaired high surrogate */
> -                    --source; /* return to the illegal value itself */
> -                    result = sourceIllegal;
> -                    break;
> -                }
> -            } else { /* We don't have the 16 bits following the high surrogate. */
> -                --source; /* return to the high surrogate */
> -                result = sourceExhausted;
> -                break;
> -            }
> -        } else if (flags == strictConversion) {
> -            /* UTF-16 surrogate values are illegal in UTF-32 */
> -            if (ch >= UNI_SUR_LOW_START && ch <= UNI_SUR_LOW_END) {
> -                --source; /* return to the illegal value itself */
> -                result = sourceIllegal;
> -                break;
> -            }
> -        }
> -        /* Figure out how many bytes the result will require */
> -        if (ch < (UTF32)0x80) {      bytesToWrite = 1;
> -        } else if (ch < (UTF32)0x800) {     bytesToWrite = 2;
> -        } else if (ch < (UTF32)0x10000) {   bytesToWrite = 3;
> -        } else if (ch < (UTF32)0x110000) {  bytesToWrite = 4;
> -        } else {                            bytesToWrite = 3;
> -                                            ch = UNI_REPLACEMENT_CHAR;
> -        }
> -
> -        target += bytesToWrite;
> -        if (target > targetEnd) {
> -            source = oldSource; /* Back up source pointer! */
> -            target -= bytesToWrite; result = targetExhausted; break;
> -        }
> -        switch (bytesToWrite) { /* note: everything falls through. */
> -            case 4: *--target = (UTF8)((ch | byteMark) & byteMask); ch >>= 6;
> -            case 3: *--target = (UTF8)((ch | byteMark) & byteMask); ch >>= 6;
> -            case 2: *--target = (UTF8)((ch | byteMark) & byteMask); ch >>= 6;
> -            case 1: *--target =  (UTF8)(ch | firstByteMark[bytesToWrite]);
> -        }
> -        target += bytesToWrite;
> -    }
> -    *sourceStart = source;
> -    *targetStart = target;
> -    return result;
> -}
> -
> -/* --------------------------------------------------------------------- */
> -
> -ConversionResult ConvertUTF32toUTF8 (
> -        const UTF32** sourceStart, const UTF32* sourceEnd,
> -        UTF8** targetStart, UTF8* targetEnd, ConversionFlags flags) {
> -    ConversionResult result = conversionOK;
> -    const UTF32* source = *sourceStart;
> -    UTF8* target = *targetStart;
> -    while (source < sourceEnd) {
> -        UTF32 ch;
> -        unsigned short bytesToWrite = 0;
> -        const UTF32 byteMask = 0xBF;
> -        const UTF32 byteMark = 0x80;
> -        ch = *source++;
> -        if (flags == strictConversion ) {
> -            /* UTF-16 surrogate values are illegal in UTF-32 */
> -            if (ch >= UNI_SUR_HIGH_START && ch <= UNI_SUR_LOW_END) {
> -                --source; /* return to the illegal value itself */
> -                result = sourceIllegal;
> -                break;
> -            }
> -        }
> -        /*
> -         * Figure out how many bytes the result will require. Turn any
> -         * illegally large UTF32 things (> Plane 17) into replacement chars.
> -         */
> -        if (ch < (UTF32)0x80) {      bytesToWrite = 1;
> -        } else if (ch < (UTF32)0x800) {     bytesToWrite = 2;
> -        } else if (ch < (UTF32)0x10000) {   bytesToWrite = 3;
> -        } else if (ch <= UNI_MAX_LEGAL_UTF32) {  bytesToWrite = 4;
> -        } else {                            bytesToWrite = 3;
> -                                            ch = UNI_REPLACEMENT_CHAR;
> -                                            result = sourceIllegal;
> -        }
> -
> -        target += bytesToWrite;
> -        if (target > targetEnd) {
> -            --source; /* Back up source pointer! */
> -            target -= bytesToWrite; result = targetExhausted; break;
> -        }
> -        switch (bytesToWrite) { /* note: everything falls through. */
> -            case 4: *--target = (UTF8)((ch | byteMark) & byteMask); ch >>= 6;
> -            case 3: *--target = (UTF8)((ch | byteMark) & byteMask); ch >>= 6;
> -            case 2: *--target = (UTF8)((ch | byteMark) & byteMask); ch >>= 6;
> -            case 1: *--target = (UTF8) (ch | firstByteMark[bytesToWrite]);
> -        }
> -        target += bytesToWrite;
> -    }
> -    *sourceStart = source;
> -    *targetStart = target;
> -    return result;
> -}
> -
> -/* --------------------------------------------------------------------- */
> -
> -/*
> - * Utility routine to tell whether a sequence of bytes is legal UTF-8.
> - * This must be called with the length pre-determined by the first byte.
> - * If not calling this from ConvertUTF8to*, then the length can be set by:
> - *  length = trailingBytesForUTF8[*source]+1;
> - * and the sequence is illegal right away if there aren't that many bytes
> - * available.
> - * If presented with a length > 4, this returns false.  The Unicode
> - * definition of UTF-8 goes up to 4-byte sequences.
> - */
> -
> -static Boolean isLegalUTF8(const UTF8 *source, int length) {
> -    UTF8 a;
> -    const UTF8 *srcptr = source+length;
> -    switch (length) {
> -    default: return false;
> -        /* Everything else falls through when "true"... */
> -    case 4: if ((a = (*--srcptr)) < 0x80 || a > 0xBF) return false;
> -    case 3: if ((a = (*--srcptr)) < 0x80 || a > 0xBF) return false;
> -    case 2: if ((a = (*--srcptr)) < 0x80 || a > 0xBF) return false;
> -
> -        switch (*source) {
> -            /* no fall-through in this inner switch */
> -            case 0xE0: if (a < 0xA0) return false; break;
> -            case 0xED: if (a > 0x9F) return false; break;
> -            case 0xF0: if (a < 0x90) return false; break;
> -            case 0xF4: if (a > 0x8F) return false; break;
> -            default:   if (a < 0x80) return false;
> -        }
> -
> -    case 1: if (*source >= 0x80 && *source < 0xC2) return false;
> -    }
> -    if (*source > 0xF4) return false;
> -    return true;
> -}
> -
> -/* --------------------------------------------------------------------- */
> -
> -/*
> - * Exported function to return whether a UTF-8 sequence is legal or not.
> - * This is not used here; it's just exported.
> - */
> -Boolean isLegalUTF8Sequence(const UTF8 *source, const UTF8 *sourceEnd) {
> -    int length = trailingBytesForUTF8[*source]+1;
> -    if (length > sourceEnd - source) {
> -        return false;
> -    }
> -    return isLegalUTF8(source, length);
> -}
> -
> -/* --------------------------------------------------------------------- */
> -
> -/*
> - * Exported function to return the total number of bytes in a codepoint
> - * represented in UTF-8, given the value of the first byte.
> - */
> -unsigned getNumBytesForUTF8(UTF8 first) {
> -  return trailingBytesForUTF8[first] + 1;
> -}
> -
> -/* --------------------------------------------------------------------- */
> -
> -/*
> - * Exported function to return whether a UTF-8 string is legal or not.
> - * This is not used here; it's just exported.
> - */
> -Boolean isLegalUTF8String(const UTF8 **source, const UTF8 *sourceEnd) {
> -    while (*source != sourceEnd) {
> -        int length = trailingBytesForUTF8[**source] + 1;
> -        if (length > sourceEnd - *source || !isLegalUTF8(*source, length))
> -            return false;
> -        *source += length;
> -    }
> -    return true;
> -}
> -
> -/* --------------------------------------------------------------------- */
> -
> -ConversionResult ConvertUTF8toUTF16 (
> -        const UTF8** sourceStart, const UTF8* sourceEnd,
> -        UTF16** targetStart, UTF16* targetEnd, ConversionFlags flags) {
> -    ConversionResult result = conversionOK;
> -    const UTF8* source = *sourceStart;
> -    UTF16* target = *targetStart;
> -    while (source < sourceEnd) {
> -        UTF32 ch = 0;
> -        unsigned short extraBytesToRead = trailingBytesForUTF8[*source];
> -        if (extraBytesToRead >= sourceEnd - source) {
> -            result = sourceExhausted; break;
> -        }
> -        /* Do this check whether lenient or strict */
> -        if (!isLegalUTF8(source, extraBytesToRead+1)) {
> -            result = sourceIllegal;
> -            break;
> -        }
> -        /*
> -         * The cases all fall through. See "Note A" below.
> -         */
> -        switch (extraBytesToRead) {
> -            case 5: ch += *source++; ch <<= 6; /* remember, illegal UTF-8 */
> -            case 4: ch += *source++; ch <<= 6; /* remember, illegal UTF-8 */
> -            case 3: ch += *source++; ch <<= 6;
> -            case 2: ch += *source++; ch <<= 6;
> -            case 1: ch += *source++; ch <<= 6;
> -            case 0: ch += *source++;
> -        }
> -        ch -= offsetsFromUTF8[extraBytesToRead];
> -
> -        if (target >= targetEnd) {
> -            source -= (extraBytesToRead+1); /* Back up source pointer! */
> -            result = targetExhausted; break;
> -        }
> -        if (ch <= UNI_MAX_BMP) { /* Target is a character <= 0xFFFF */
> -            /* UTF-16 surrogate values are illegal in UTF-32 */
> -            if (ch >= UNI_SUR_HIGH_START && ch <= UNI_SUR_LOW_END) {
> -                if (flags == strictConversion) {
> -                    source -= (extraBytesToRead+1); /* return to the illegal value itself */
> -                    result = sourceIllegal;
> -                    break;
> -                } else {
> -                    *target++ = UNI_REPLACEMENT_CHAR;
> -                }
> -            } else {
> -                *target++ = (UTF16)ch; /* normal case */
> -            }
> -        } else if (ch > UNI_MAX_UTF16) {
> -            if (flags == strictConversion) {
> -                result = sourceIllegal;
> -                source -= (extraBytesToRead+1); /* return to the start */
> -                break; /* Bail out; shouldn't continue */
> -            } else {
> -                *target++ = UNI_REPLACEMENT_CHAR;
> -            }
> -        } else {
> -            /* target is a character in range 0xFFFF - 0x10FFFF. */
> -            if (target + 1 >= targetEnd) {
> -                source -= (extraBytesToRead+1); /* Back up source pointer! */
> -                result = targetExhausted; break;
> -            }
> -            ch -= halfBase;
> -            *target++ = (UTF16)((ch >> halfShift) + UNI_SUR_HIGH_START);
> -            *target++ = (UTF16)((ch & halfMask) + UNI_SUR_LOW_START);
> -        }
> -    }
> -    *sourceStart = source;
> -    *targetStart = target;
> -    return result;
> -}
> -
> -/* --------------------------------------------------------------------- */
> -
> -ConversionResult ConvertUTF8toUTF32 (
> -        const UTF8** sourceStart, const UTF8* sourceEnd,
> -        UTF32** targetStart, UTF32* targetEnd, ConversionFlags flags) {
> -    ConversionResult result = conversionOK;
> -    const UTF8* source = *sourceStart;
> -    UTF32* target = *targetStart;
> -    while (source < sourceEnd) {
> -        UTF32 ch = 0;
> -        unsigned short extraBytesToRead = trailingBytesForUTF8[*source];
> -        if (extraBytesToRead >= sourceEnd - source) {
> -            result = sourceExhausted; break;
> -        }
> -        /* Do this check whether lenient or strict */
> -        if (!isLegalUTF8(source, extraBytesToRead+1)) {
> -            result = sourceIllegal;
> -            break;
> -        }
> -        /*
> -         * The cases all fall through. See "Note A" below.
> -         */
> -        switch (extraBytesToRead) {
> -            case 5: ch += *source++; ch <<= 6;
> -            case 4: ch += *source++; ch <<= 6;
> -            case 3: ch += *source++; ch <<= 6;
> -            case 2: ch += *source++; ch <<= 6;
> -            case 1: ch += *source++; ch <<= 6;
> -            case 0: ch += *source++;
> -        }
> -        ch -= offsetsFromUTF8[extraBytesToRead];
> -
> -        if (target >= targetEnd) {
> -            source -= (extraBytesToRead+1); /* Back up the source pointer! */
> -            result = targetExhausted; break;
> -        }
> -        if (ch <= UNI_MAX_LEGAL_UTF32) {
> -            /*
> -             * UTF-16 surrogate values are illegal in UTF-32, and anything
> -             * over Plane 17 (> 0x10FFFF) is illegal.
> -             */
> -            if (ch >= UNI_SUR_HIGH_START && ch <= UNI_SUR_LOW_END) {
> -                if (flags == strictConversion) {
> -                    source -= (extraBytesToRead+1); /* return to the illegal value itself */
> -                    result = sourceIllegal;
> -                    break;
> -                } else {
> -                    *target++ = UNI_REPLACEMENT_CHAR;
> -                }
> -            } else {
> -                *target++ = ch;
> -            }
> -        } else { /* i.e., ch > UNI_MAX_LEGAL_UTF32 */
> -            result = sourceIllegal;
> -            *target++ = UNI_REPLACEMENT_CHAR;
> -        }
> -    }
> -    *sourceStart = source;
> -    *targetStart = target;
> -    return result;
> -}
> -
> -/* ---------------------------------------------------------------------
> -
> -    Note A.
> -    The fall-through switches in UTF-8 reading code save a
> -    temp variable, some decrements & conditionals.  The switches
> -    are equivalent to the following loop:
> -        {
> -            int tmpBytesToRead = extraBytesToRead+1;
> -            do {
> -                ch += *source++;
> -                --tmpBytesToRead;
> -                if (tmpBytesToRead) ch <<= 6;
> -            } while (tmpBytesToRead > 0);
> -        }
> -    In UTF-8 writing code, the switches on "bytesToWrite" are
> -    similarly unrolled loops.
> -
> -   --------------------------------------------------------------------- */
>
> Removed: cfe/trunk/lib/Basic/ConvertUTFWrapper.cpp
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Basic/ConvertUTFWrapper.cpp?rev=173923&view=auto
> ==============================================================================
> --- cfe/trunk/lib/Basic/ConvertUTFWrapper.cpp (original)
> +++ cfe/trunk/lib/Basic/ConvertUTFWrapper.cpp (removed)
> @@ -1,76 +0,0 @@
> -//===-- ConvertUTFWrapper.cpp - Wrap ConvertUTF.h with clang data types -----===
> -//
> -//                     The LLVM Compiler Infrastructure
> -//
> -// This file is distributed under the University of Illinois Open Source
> -// License. See LICENSE.TXT for details.
> -//
> -//===----------------------------------------------------------------------===//
> -
> -#include "clang/Basic/ConvertUTF.h"
> -#include "clang/Basic/LLVM.h"
> -
> -namespace clang {
> -
> -bool ConvertUTF8toWide(unsigned WideCharWidth, llvm::StringRef Source,
> -                       char *&ResultPtr, const UTF8 *&ErrorPtr) {
> -  assert(WideCharWidth == 1 || WideCharWidth == 2 || WideCharWidth == 4);
> -  ConversionResult result = conversionOK;
> -  // Copy the character span over.
> -  if (WideCharWidth == 1) {
> -    const UTF8 *Pos = reinterpret_cast<const UTF8*>(Source.begin());
> -    if (!isLegalUTF8String(&Pos, reinterpret_cast<const UTF8*>(Source.end()))) {
> -      result = sourceIllegal;
> -      ErrorPtr = Pos;
> -    } else {
> -      memcpy(ResultPtr, Source.data(), Source.size());
> -      ResultPtr += Source.size();
> -    }
> -  } else if (WideCharWidth == 2) {
> -    const UTF8 *sourceStart = (const UTF8*)Source.data();
> -    // FIXME: Make the type of the result buffer correct instead of
> -    // using reinterpret_cast.
> -    UTF16 *targetStart = reinterpret_cast<UTF16*>(ResultPtr);
> -    ConversionFlags flags = strictConversion;
> -    result = ConvertUTF8toUTF16(
> -        &sourceStart, sourceStart + Source.size(),
> -        &targetStart, targetStart + 2*Source.size(), flags);
> -    if (result == conversionOK)
> -      ResultPtr = reinterpret_cast<char*>(targetStart);
> -    else
> -      ErrorPtr = sourceStart;
> -  } else if (WideCharWidth == 4) {
> -    const UTF8 *sourceStart = (const UTF8*)Source.data();
> -    // FIXME: Make the type of the result buffer correct instead of
> -    // using reinterpret_cast.
> -    UTF32 *targetStart = reinterpret_cast<UTF32*>(ResultPtr);
> -    ConversionFlags flags = strictConversion;
> -    result = ConvertUTF8toUTF32(
> -        &sourceStart, sourceStart + Source.size(),
> -        &targetStart, targetStart + 4*Source.size(), flags);
> -    if (result == conversionOK)
> -      ResultPtr = reinterpret_cast<char*>(targetStart);
> -    else
> -      ErrorPtr = sourceStart;
> -  }
> -  assert((result != targetExhausted)
> -         && "ConvertUTF8toUTFXX exhausted target buffer");
> -  return result == conversionOK;
> -}
> -
> -bool ConvertCodePointToUTF8(unsigned Source, char *&ResultPtr) {
> -  const UTF32 *SourceStart = &Source;
> -  const UTF32 *SourceEnd = SourceStart + 1;
> -  UTF8 *TargetStart = reinterpret_cast<UTF8 *>(ResultPtr);
> -  UTF8 *TargetEnd = TargetStart + 4;
> -  ConversionResult CR = ConvertUTF32toUTF8(&SourceStart, SourceEnd,
> -                                           &TargetStart, TargetEnd,
> -                                           strictConversion);
> -  if (CR != conversionOK)
> -    return false;
> -
> -  ResultPtr = reinterpret_cast<char*>(TargetStart);
> -  return true;
> -}
> -
> -} // end namespace clang
>
> Modified: cfe/trunk/lib/CodeGen/CGExpr.cpp
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGExpr.cpp?rev=173924&r1=173923&r2=173924&view=diff
> ==============================================================================
> --- cfe/trunk/lib/CodeGen/CGExpr.cpp (original)
> +++ cfe/trunk/lib/CodeGen/CGExpr.cpp Wed Jan 30 06:06:08 2013
> @@ -21,13 +21,14 @@
>  #include "TargetInfo.h"
>  #include "clang/AST/ASTContext.h"
>  #include "clang/AST/DeclObjC.h"
> -#include "clang/Basic/ConvertUTF.h"
>  #include "clang/Frontend/CodeGenOptions.h"
>  #include "llvm/ADT/Hashing.h"
>  #include "llvm/IR/DataLayout.h"
>  #include "llvm/IR/Intrinsics.h"
>  #include "llvm/IR/LLVMContext.h"
>  #include "llvm/IR/MDBuilder.h"
> +#include "llvm/Support/ConvertUTF.h"
> +
>  using namespace clang;
>  using namespace CodeGen;
>
>
> Modified: cfe/trunk/lib/CodeGen/CodeGenModule.cpp
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CodeGenModule.cpp?rev=173924&r1=173923&r2=173924&view=diff
> ==============================================================================
> --- cfe/trunk/lib/CodeGen/CodeGenModule.cpp (original)
> +++ cfe/trunk/lib/CodeGen/CodeGenModule.cpp Wed Jan 30 06:06:08 2013
> @@ -30,7 +30,6 @@
>  #include "clang/AST/RecordLayout.h"
>  #include "clang/AST/RecursiveASTVisitor.h"
>  #include "clang/Basic/Builtins.h"
> -#include "clang/Basic/ConvertUTF.h"
>  #include "clang/Basic/Diagnostic.h"
>  #include "clang/Basic/Module.h"
>  #include "clang/Basic/SourceManager.h"
> @@ -44,8 +43,10 @@
>  #include "llvm/IR/LLVMContext.h"
>  #include "llvm/IR/Module.h"
>  #include "llvm/Support/CallSite.h"
> +#include "llvm/Support/ConvertUTF.h"
>  #include "llvm/Support/ErrorHandling.h"
>  #include "llvm/Target/Mangler.h"
> +
>  using namespace clang;
>  using namespace CodeGen;
>
>
> Modified: cfe/trunk/lib/Frontend/TextDiagnostic.cpp
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Frontend/TextDiagnostic.cpp?rev=173924&r1=173923&r2=173924&view=diff
> ==============================================================================
> --- cfe/trunk/lib/Frontend/TextDiagnostic.cpp (original)
> +++ cfe/trunk/lib/Frontend/TextDiagnostic.cpp Wed Jan 30 06:06:08 2013
> @@ -8,13 +8,13 @@
>  //===----------------------------------------------------------------------===//
>
>  #include "clang/Frontend/TextDiagnostic.h"
> -#include "clang/Basic/ConvertUTF.h"
>  #include "clang/Basic/DiagnosticOptions.h"
>  #include "clang/Basic/FileManager.h"
>  #include "clang/Basic/SourceManager.h"
>  #include "clang/Lex/Lexer.h"
>  #include "llvm/ADT/SmallString.h"
>  #include "llvm/ADT/StringExtras.h"
> +#include "llvm/Support/ConvertUTF.h"
>  #include "llvm/Support/ErrorHandling.h"
>  #include "llvm/Support/Locale.h"
>  #include "llvm/Support/MemoryBuffer.h"
>
> Modified: cfe/trunk/lib/Lex/Lexer.cpp
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Lex/Lexer.cpp?rev=173924&r1=173923&r2=173924&view=diff
> ==============================================================================
> --- cfe/trunk/lib/Lex/Lexer.cpp (original)
> +++ cfe/trunk/lib/Lex/Lexer.cpp Wed Jan 30 06:06:08 2013
> @@ -25,7 +25,6 @@
>  //===----------------------------------------------------------------------===//
>
>  #include "clang/Lex/Lexer.h"
> -#include "clang/Basic/ConvertUTF.h"
>  #include "clang/Basic/SourceManager.h"
>  #include "clang/Lex/CodeCompletionHandler.h"
>  #include "clang/Lex/LexDiagnostic.h"
> @@ -34,6 +33,7 @@
>  #include "llvm/ADT/StringExtras.h"
>  #include "llvm/ADT/StringSwitch.h"
>  #include "llvm/Support/Compiler.h"
> +#include "llvm/Support/ConvertUTF.h"
>  #include "llvm/Support/MemoryBuffer.h"
>  #include <cstring>
>  using namespace clang;
> @@ -1655,10 +1655,11 @@ FinishIdentifier:
>      } else if (!isASCII(C)) {
>        const char *UnicodePtr = CurPtr;
>        UTF32 CodePoint;
> -      ConversionResult Result = convertUTF8Sequence((const UTF8 **)&UnicodePtr,
> -                                                    (const UTF8 *)BufferEnd,
> -                                                    &CodePoint,
> -                                                    strictConversion);
> +      ConversionResult Result =
> +          llvm::convertUTF8Sequence((const UTF8 **)&UnicodePtr,
> +                                    (const UTF8 *)BufferEnd,
> +                                    &CodePoint,
> +                                    strictConversion);
>        if (Result != conversionOK ||
>            !isAllowedIDChar(static_cast<uint32_t>(CodePoint)))
>          goto FinishIdentifier;
> @@ -3528,10 +3529,11 @@ LexNextToken:
>      // We can't just reset CurPtr to BufferPtr because BufferPtr may point to
>      // an escaped newline.
>      --CurPtr;
> -    ConversionResult Status = convertUTF8Sequence((const UTF8 **)&CurPtr,
> -                                                  (const UTF8 *)BufferEnd,
> -                                                  &CodePoint,
> -                                                  strictConversion);
> +    ConversionResult Status =
> +        llvm::convertUTF8Sequence((const UTF8 **)&CurPtr,
> +                                  (const UTF8 *)BufferEnd,
> +                                  &CodePoint,
> +                                  strictConversion);
>      if (Status == conversionOK)
>        return LexUnicode(Result, CodePoint, CurPtr);
>
>
> Modified: cfe/trunk/lib/Lex/LiteralSupport.cpp
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Lex/LiteralSupport.cpp?rev=173924&r1=173923&r2=173924&view=diff
> ==============================================================================
> --- cfe/trunk/lib/Lex/LiteralSupport.cpp (original)
> +++ cfe/trunk/lib/Lex/LiteralSupport.cpp Wed Jan 30 06:06:08 2013
> @@ -13,12 +13,13 @@
>  //===----------------------------------------------------------------------===//
>
>  #include "clang/Lex/LiteralSupport.h"
> -#include "clang/Basic/ConvertUTF.h"
>  #include "clang/Basic/TargetInfo.h"
>  #include "clang/Lex/LexDiagnostic.h"
>  #include "clang/Lex/Preprocessor.h"
>  #include "llvm/ADT/StringExtras.h"
> +#include "llvm/Support/ConvertUTF.h"
>  #include "llvm/Support/ErrorHandling.h"
> +
>  using namespace clang;
>
>  static unsigned getCharWidth(tok::TokenKind kind, const TargetInfo &Target) {
>
> Modified: cfe/trunk/lib/Lex/Preprocessor.cpp
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Lex/Preprocessor.cpp?rev=173924&r1=173923&r2=173924&view=diff
> ==============================================================================
> --- cfe/trunk/lib/Lex/Preprocessor.cpp (original)
> +++ cfe/trunk/lib/Lex/Preprocessor.cpp Wed Jan 30 06:06:08 2013
> @@ -27,7 +27,6 @@
>
>  #include "clang/Lex/Preprocessor.h"
>  #include "MacroArgs.h"
> -#include "clang/Basic/ConvertUTF.h"
>  #include "clang/Basic/FileManager.h"
>  #include "clang/Basic/SourceManager.h"
>  #include "clang/Basic/TargetInfo.h"
> @@ -47,6 +46,7 @@
>  #include "llvm/ADT/STLExtras.h"
>  #include "llvm/ADT/StringExtras.h"
>  #include "llvm/Support/Capacity.h"
> +#include "llvm/Support/ConvertUTF.h"
>  #include "llvm/Support/MemoryBuffer.h"
>  #include "llvm/Support/raw_ostream.h"
>  using namespace clang;
> @@ -501,7 +501,7 @@ static void appendCodePoint(unsigned Cod
>                              llvm::SmallVectorImpl<char> &Str) {
>    char ResultBuf[4];
>    char *ResultPtr = ResultBuf;
> -  bool Res = ConvertCodePointToUTF8(Codepoint, ResultPtr);
> +  bool Res = llvm::ConvertCodePointToUTF8(Codepoint, ResultPtr);
>    (void)Res;
>    assert(Res && "Unexpected conversion failure");
>    Str.append(ResultBuf, ResultPtr);
>
> Modified: cfe/trunk/lib/Sema/SemaChecking.cpp
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaChecking.cpp?rev=173924&r1=173923&r2=173924&view=diff
> ==============================================================================
> --- cfe/trunk/lib/Sema/SemaChecking.cpp (original)
> +++ cfe/trunk/lib/Sema/SemaChecking.cpp Wed Jan 30 06:06:08 2013
> @@ -24,7 +24,6 @@
>  #include "clang/AST/StmtCXX.h"
>  #include "clang/AST/StmtObjC.h"
>  #include "clang/Analysis/Analyses/FormatString.h"
> -#include "clang/Basic/ConvertUTF.h"
>  #include "clang/Basic/TargetBuiltins.h"
>  #include "clang/Basic/TargetInfo.h"
>  #include "clang/Lex/Preprocessor.h"
> @@ -35,6 +34,7 @@
>  #include "llvm/ADT/BitVector.h"
>  #include "llvm/ADT/STLExtras.h"
>  #include "llvm/ADT/SmallString.h"
> +#include "llvm/Support/ConvertUTF.h"
>  #include "llvm/Support/raw_ostream.h"
>  #include <limits>
>  using namespace clang;
>
>
> _______________________________________________
> cfe-commits mailing list
> cfe-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

-- 
Timur Iskhodzhanov,
Google Russia